— | study:noll_m._g._yeung_a._2009_._telling_experts_from_spammers [2018/08/27 02:22] (current) – created - external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | == Telling experts from spammers: Expertise ranking in folksonomies / Noll & Yeung (2009) == | ||
+ | |||
+ | **Citation** - Noll, M. G., & Yeung, A. (2009). Telling experts from spammers: Expertise ranking in folksonomies. \\ Chinese title: 區別專家與灌水者:以群眾分類學進行專業評比 | ||
+ | |||
+ | **Keyword** - [[: | ||
+ | |||
+ | * Research subject: | ||
+ | ** (Behavior) tagging: freely annotating resources with keywords | ||
+ | ** (Purpose of the behavior): | ||
+ | ** (Collaborative tagging platforms): | ||
+ | *** (Usage behavior on the platform): | ||
+ | ** (Phenomenon resulting from the behavior) bottom-up “categorization” by end users, also known as “folksonomy” | ||
+ | * Current state of the research problem: | ||
+ | * Research objective: | ||
+ | |||
+ | |||
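+ | * Design principles (research assumptions): | ||
+ | ## (1) The greater a user's expertise, the higher the quality of the resources they share; | ||
+ | ## (2) Experts discover useful resources earlier than other users \\ **Discoverers vs. followers**: | ||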
+ | |||
+ | |||
+ | * (Research design) Algorithm design: | ||
+ | ** Builds on IR research that uses expert identification to improve retrieval relevance. The approach resembles citation analysis. | ||
+ | * (Evaluation and analysis): We carry out experiments on both simulated and real-world data sets obtained from Delicious, and show that SPEAR is able to detect the difference between different types of experts, and is more resistant to spammers than other methods. | ||
+ | |||
+ | |||
+ | * SPEAR – SPamming-resistant Expertise Analysis and Ranking | ||
+ | ** Based on the HITS (Hypertext Induced Topic Search) algorithm | ||
+ | *** Hubs: pages that point to many good pages | ||
+ | *** Authorities: | ||
+ | ** The concepts of expertise and quality are analogous to hubs and authorities | ||
+ | *** Users (experts) are hubs – we find useful pages through them | ||
+ | *** High-quality pages are authorities – they provide relevant information | ||
+ | ** Differences: | ||
+ | |||
+ | |||
+ | * Algorithm | ||
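A minimal sketch of how a SPEAR-style ranking could be computed. It assumes the HITS-like mutual reinforcement between users and pages noted above, plus a time-based weighting in which earlier taggers of a page earn more credit than later ones. The function name `spear`, the `(user, doc, time)` input format, the square-root credit function and the fixed iteration count are assumptions of this sketch, not details recorded in these notes.

```python
import math


def spear(bookmarks, iterations=50, credit=math.sqrt):
    """Rank users and pages for one tag from (user, doc, time) bookmarks.

    Assumes each user bookmarks a given page at most once.
    Returns (expertise, quality) dicts whose values each sum to 1.
    """
    by_doc = {}
    for user, doc, time in bookmarks:
        by_doc.setdefault(doc, []).append((time, user))

    # Time-weighted adjacency (an assumption of this sketch): each tagger of
    # a page gets credit(number of taggers at or after them), so discoverers
    # earn more than followers. Ties in time are broken by user name.
    weight = {}
    for doc, taggers in by_doc.items():
        taggers.sort()
        for rank, (_time, user) in enumerate(taggers):
            weight[(user, doc)] = credit(len(taggers) - rank)

    expertise = {user: 1.0 for user, _doc in weight}
    quality = {doc: 1.0 for doc in by_doc}

    def normalized(scores):
        total = sum(scores.values())
        return {key: value / total for key, value in scores.items()}

    # HITS-style mutual reinforcement: users act as hubs, pages as authorities.
    for _ in range(iterations):
        new_expertise = dict.fromkeys(expertise, 0.0)
        for (user, doc), w in weight.items():
            new_expertise[user] += w * quality[doc]
        expertise = normalized(new_expertise)

        new_quality = dict.fromkeys(quality, 0.0)
        for (user, doc), w in weight.items():
            new_quality[doc] += w * expertise[user]
        quality = normalized(new_quality)

    return expertise, quality


bookmarks = [
    ("alice", "d1", 1), ("bob", "d1", 2), ("carol", "d1", 3),
    ("alice", "d2", 1), ("bob", "d2", 2),
]
expertise, quality = spear(bookmarks)
print(sorted(expertise, key=expertise.get, reverse=True))
# → ['alice', 'bob', 'carol']
```

Passing `credit=lambda n: 1.0` removes the time weighting and reduces the sketch to plain HITS on the user-page graph, which is one way to see what the timing information contributes.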
+ | |||
+ | |||
+ | * Experimental design | ||
+ | ** Workaround: insert simulated users into real-world data from Delicious.com and check where they end up after ranking | ||
+ | ** Compared 50 tags from Delicious.com, covering 515,000 real users, 71, | ||
+ | ** Simulated-user variables: probabilistic simulation, with simulated users generated from four parameters | ||
+ | *** P1: Number of the user’s bookmarks – active or inactive user? | ||
+ | *** P2: Newness – fraction of Web pages not already in the data set | ||
+ | *** P3: Time preference – discoverer or follower? | ||
+ | *** P4: Document preference – high quality or low quality? | ||
+ | ** Six user types are distinguished | ||
+ | *** Geek – lots of high quality documents, discoverer (Distinguished Researcher) | ||
+ | *** Veteran – high quality documents, discoverer (Professor) | ||
+ | *** Newcomer – high quality documents, follower (PhD student) | ||
+ | *** Flooder – lots of random documents, follower (found in Delicious) | ||
+ | *** Promoter – some documents (most are his own), discoverer (found in Delicious) | ||
+ | *** Trojan (ordinary netizen) – some documents, follower (next-gen spammer) | ||
+ | ** The performance of three algorithms is compared | ||
+ | *** SPEAR | ||
+ | *** HITS | ||
+ | *** frequency count ranking algorithm, FREQ, | ||
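The four simulation parameters listed above could be turned into a generator along the following lines. This is only a sketch: the `UserProfile` class, the `simulate_user` function, the numeric encoding of P2 to P4 as values between 0 and 1, and the `(first, last)` timeline representation are all hypothetical choices made for illustration, not the setup used in the study.

```python
import random
from dataclasses import dataclass


@dataclass
class UserProfile:
    n_bookmarks: int     # P1: how many pages the user bookmarks
    newness: float       # P2: fraction of pages not already in the data set
    time_pref: float     # P3: 0.0 = always first (discoverer), 1.0 = always last (follower)
    quality_pref: float  # P4: probability of choosing a high-quality page


def simulate_user(name, profile, good_docs, bad_docs, rng):
    """Generate (user, doc, time) bookmarks for one simulated user.

    good_docs / bad_docs map a document id to the (first, last) times at
    which real users bookmarked it.
    """
    n_new = round(profile.n_bookmarks * profile.newness)
    bookmarks = []
    # Brand-new pages: nobody else has bookmarked them, so the time is arbitrary.
    for i in range(n_new):
        bookmarks.append((name, f"{name}-new-{i}", 0.0))
    # Existing pages: choose by quality preference, then place the bookmark on
    # the page's timeline according to the time preference. The same page may
    # be drawn twice; a fuller version would sample without replacement.
    for _ in range(profile.n_bookmarks - n_new):
        pool = good_docs if rng.random() < profile.quality_pref else bad_docs
        doc = rng.choice(sorted(pool))
        first, last = pool[doc]
        bookmarks.append((name, doc, first + profile.time_pref * (last - first)))
    return bookmarks


good = {"g1": (10.0, 20.0), "g2": (5.0, 9.0)}
bad = {"b1": (1.0, 2.0)}
veteran = UserProfile(n_bookmarks=10, newness=0.2, time_pref=0.0, quality_pref=1.0)
flooder = UserProfile(n_bookmarks=40, newness=0.0, time_pref=1.0, quality_pref=0.5)
rng = random.Random(0)
for name, profile in [("veteran", veteran), ("flooder", flooder)]:
    print(name, len(simulate_user(name, profile, good, bad, rng)))
```

The generated bookmarks can then be merged with the real data before ranking, so that the final position of each simulated user shows how the ranking treats that user type.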
+ | |||
+ | |||
+ | * Results | ||
+ | ** SPEAR distinguishes the three types of spammer more effectively than the other two algorithms | ||
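To see why a pure frequency count is easy for a flooder to game, here is a minimal sketch of a FREQ-style baseline. It assumes FREQ simply ranks users by the number of distinct pages they bookmarked with the tag; the function name `freq_rank` and the `(user, doc, time)` input format are hypothetical.

```python
from collections import Counter


def freq_rank(bookmarks):
    """Rank users by how many distinct pages they bookmarked (most first)."""
    pairs = {(user, doc) for user, doc, _time in bookmarks}
    counts = Counter(user for user, _doc in pairs)
    # Sort by descending count, then by name so the order is deterministic.
    return [user for user, _n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


bookmarks = [
    ("veteran", "d1", 1), ("veteran", "d2", 1),
    ("flooder", "d1", 9), ("flooder", "d2", 9),
    ("flooder", "d3", 9), ("flooder", "d4", 9),
]
print(freq_rank(bookmarks))
# → ['flooder', 'veteran']
```

The count ignores both when a page was bookmarked and how good it is, so bookmarking many pages late is enough to reach the top of this ranking.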
+ | |||
+ | == Note == | ||
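This paper conflates folksonomy with collaborative tagging in its definitions. That may be theoretically contentious, but it is acceptable if folksonomy is defined as a phenomenon. What the authors call a folksonomy is closer to a collaborative-tagging graph. | ||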
+ | |||
+ | == Metadata/ | ||
+ | |||
+ | {{backlinks> | ||
+ | {{tag> | ||
+ | |||