**Citation -** 林頌堅. (2003). 基於詞語抽取的圖書與資訊學刊研究主題分析. 圖書與資訊學刊(47), 15-35.

**Keyword -** [[:domain_analysis]], [[study:keyword_extraction]],  [[study:latend_semantic_indexing]]

**Tag -** [[tag>domain analysis]], [[tag>latend semantic indexing]], [[tag>journal article]], [[tag>clustering analysis]]


== Term-extraction-based analysis of research topics in 圖書與資訊學刊 ==

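=== Domain analysis ===
Domain analysis seeks to identify the prevalent research topics and the knowledge structure within a discipline, so that researchers can understand the current state of its development.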

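=== Identifying document topics ===
Concept extraction
  - Extract keyword terms from the Chinese and English abstracts of the journal's articles.
  - Apply [[study:latend_semantic_indexing|latent semantic indexing (LSI)]] to the extracted terms to compute how strongly the terms are related to one another.
  - Run the Cliques clustering algorithm on the term-relatedness results to perform [[study:clustering_analysis|clustering analysis]], forming concept sets. A concept set can be regarded as the sum of the features of the terms it contains.
  - Through latent semantic indexing, a concept set can point to the related documents.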
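The concept-extraction steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: the term-by-abstract matrix, the rank ''k'', the similarity threshold, and the reading of "Cliques" as maximal-clique enumeration (Bron-Kerbosch) over a thresholded similarity graph are all assumptions made here.

```python
import numpy as np

# Hypothetical toy data: rows are extracted terms, columns are abstracts.
terms = ["library", "catalog", "retrieval", "indexing"]
term_doc = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])

def lsi_term_similarity(matrix, k):
    """Cosine similarity between terms in a rank-k LSI space."""
    # Truncated SVD: matrix is approximated by U_k diag(s_k) Vt_k.
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    term_vecs = u[:, :k] * s[:k]  # one k-dimensional row per term
    norms = np.linalg.norm(term_vecs, axis=1, keepdims=True)
    unit = term_vecs / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

def maximal_cliques(adj):
    """Bron-Kerbosch enumeration; adj maps each node to its neighbour set."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)

similarity = lsi_term_similarity(term_doc, k=2)

# Link two terms when their LSI similarity reaches an assumed threshold.
threshold = 0.5
adjacency = {t: set() for t in terms}
for i, a in enumerate(terms):
    for j, b in enumerate(terms):
        if i != j and similarity[i, j] >= threshold:
            adjacency[a].add(b)

concept_sets = maximal_cliques(adjacency)
print(concept_sets)
```

Each maximal clique is a group of terms that are all mutually related, which is one plausible way to read a "concept set". A term with no neighbours comes back as a one-term clique.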

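Criteria for identifying core concepts
  - Purpose: find the most salient and important concepts to serve as topics.
  - Core concept sets: judged by how strongly a concept set is related to other concept sets. The more concept sets it is related to, the more likely it represents an important research topic.
  - Apply Ward clustering to the core concept sets to identify the research topics.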
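As a minimal sketch of the "related to more concept sets" criterion above, the snippet below counts how many other concept sets each one is linked to. The concept names, the similarity values, the threshold, and the ''min_links'' cut-off are hypothetical; the notes do not say how the paper measured relatedness between concept sets, and the Ward clustering step is not shown.

```python
# Hypothetical pairwise similarities between four concept sets.
names = ["cataloging", "retrieval", "bibliometrics", "archives"]
similarity = [
    [1.0, 0.6, 0.3, 0.1],
    [0.6, 1.0, 0.7, 0.2],
    [0.3, 0.7, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]

def core_concepts(similarity, names, threshold, min_links):
    """Rank concept sets by how many other concept sets they relate to."""
    degree = {}
    for i, name in enumerate(names):
        # Count the other concept sets whose similarity reaches the threshold.
        degree[name] = sum(
            1 for j, value in enumerate(similarity[i])
            if j != i and value >= threshold
        )
    # Most-connected first; ties are broken alphabetically.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [(name, links) for name, links in ranked if links >= min_links]

core = core_concepts(similarity, names, threshold=0.5, min_links=1)
print(core)
```

With these toy values, "retrieval" is linked to two other concept sets and ranks first, while "archives" has no links and is dropped.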

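Tracing back from research topics to the related papers
  - Rationale: the features of a research topic are taken to be the sum of the features of its concept sets.
  - Through latent semantic indexing, the summed concept sets can point to the related documents.
  - Compute descriptive statistics on authors and publication dates for the related documents within each topic.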
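The trace-back described above might look like the sketch below. It sums the term indicators of a topic's concept sets into one query vector, folds that query into the LSI space in the manner of Deerwester et al. (query times ''U_k'' times the inverse of the singular values), and keeps the documents whose cosine with the query passes a threshold. The toy matrix, the metadata, the rank ''k'', and the threshold are assumptions for illustration only.

```python
from collections import Counter

import numpy as np

# Hypothetical toy data: rows are terms, columns are documents.
terms = ["library", "catalog", "retrieval", "indexing"]
term_doc = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])
# Hypothetical metadata, one entry per document column.
doc_meta = [
    {"author": "Author A", "year": 1998},
    {"author": "Author B", "year": 1998},
    {"author": "Author A", "year": 2001},
    {"author": "Author C", "year": 2002},
]

def related_documents(matrix, query, k, threshold):
    """Indices of documents whose LSI cosine with the query >= threshold."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    query_vec = (query @ u[:, :k]) / s[:k]  # fold the query into LSI space
    doc_vecs = vt[:k].T                     # one k-dimensional row per document
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    cosine = (doc_vecs @ query_vec) / np.where(denom == 0, 1.0, denom)
    return [int(j) for j in np.flatnonzero(cosine >= threshold)]

# A topic is a list of concept sets; its features are the sum of theirs.
topic = [["catalog", "library"]]
query = np.zeros(len(terms))
for concept_set in topic:
    for term in concept_set:
        query[terms.index(term)] += 1.0

related = related_documents(term_doc, query, k=2, threshold=0.5)

# Descriptive statistics over the related documents.
by_year = Counter(doc_meta[j]["year"] for j in related)
by_author = Counter(doc_meta[j]["author"] for j in related)
print(related, by_year, by_author)
```

The two ''Counter'' objects give the per-year and per-author document counts for the topic, which is the kind of descriptive statistic the last step calls for.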

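=== Visualizing topic relationships ===
  - Use clustering analysis and [[study:multi-dimensioanl_scaling|multidimensional scaling (MDS)]] to present the "co-document relationships" among topics, showing how closely the topics are related.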
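One way to turn co-document relationships into a two-dimensional layout is sketched below. The notes do not say which MDS variant or which dissimilarity measure the paper used, so both choices here are assumptions: the dissimilarity is one minus the Jaccard overlap of the document sets of two topics, and the layout uses classical (Torgerson) MDS. The topic names and document ids are hypothetical.

```python
import numpy as np

# Hypothetical mapping from each topic to the ids of its related documents.
topic_docs = {
    "cataloging": {1, 2, 3, 4},
    "retrieval": {3, 4, 5},
    "bibliometrics": {1, 6, 7},
}

def jaccard_dissimilarity(topic_docs):
    """1 - |A & B| / |A | B| for every pair of topics."""
    names = sorted(topic_docs)
    n = len(names)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            a, b = topic_docs[names[i]], topic_docs[names[j]]
            union = a | b
            shared = len(a & b) / len(union) if union else 0.0
            d[i, j] = d[j, i] = 1.0 - shared
    return names, d

def classical_mds(d, dims=2):
    """Classical MDS: double-centre the squared distances, then eigendecompose."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:dims]  # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

names, dissim = jaccard_dissimilarity(topic_docs)
coords = classical_mds(dissim, dims=2)

# Pairwise distances in the layout, for comparison with the input.
layout_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

Topics that share more documents end up closer together in the layout. With only three topics the plane reproduces the input dissimilarities exactly; with more topics a two-dimensional layout is generally an approximation.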

== related references ==
  * [[Hjorland, B., & Albrechtsen, H. (1995). Toward a new horizon in information science|Hjørland, B., & Albrechtsen, H. (1995). Toward a new horizon in information science: Domain-analysis.]] Journal of the American Society for Information Science, 46(6), 400-425.
  * [[Deerwester, S., Dumais, S., Landauer, T., Furnas, G., & Harshman, R. (1990). Indexing by latent semantic analysis|Deerwester, S., Dumais, S., Landauer, T., Furnas, G., & Harshman, R. (1990). Indexing by latent semantic analysis.]] Journal of the American Society for Information Science, 41(6), 391-407.