Telling experts from spammers: Expertise ranking in folksonomies / Noll & Yeung (2009)

Citation - Noll, M. G., & Yeung, A. (2009). Telling experts from spammers: Expertise ranking in folksonomies.
Chinese title (note-taker's translation): 區別專家與灌水者：以群眾分類學進行專業評比

Keyword - folksonomy, social network analysis

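Object of study: tagging of web resources in the del.icio.us online community
- (Behavior) Tagging: freely annotating resources with keywords
- (Purpose of the behavior): self-organizing resources, sharing, self-promotion, …
- (Collaborative tagging platforms): web services and community platforms that let users annotate resources with their own keywords (e.g., delicious.com, flickr.com)
  - (Uses of the platform): searching for relevant resources, searching for experts in a particular domain
- (Resulting phenomenon): bottom-up “categorization” by end users, aka “folksonomy”
Current state of the problem: existing rankings rely only on quantity and frequency, so they cannot tell expert tagging apart from high-volume spam tagging.
Research goal: design a new algorithm, SPEAR (SPamming-resistant Expertise Analysis and Ranking), that distinguishes experts from spammers and thereby improves search relevance.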

Design principles (research assumptions), in the paper's words:
We propose that the level of expertise of a user with respect to a particular topic is mainly determined by two factors: (1) there should be a relationship of mutual reinforcement between the expertise of a user and the quality of a resource; and (2) an expert should be one who tends to identify useful resources before other users discover them.
1. The more expert a user is, the higher the quality of the resources they share.
  Mutual reinforcement of user expertise and document quality: Expert users tend to have many high quality documents, and high quality documents are tagged by users of high expertise.
2. Experts discover useful resources earlier than other users do.
  Discoverers vs. followers: Expert users are discoverers – they tend to be the first to bookmark and tag high quality documents, thereby bringing them to the attention of the user community. Think: researchers in academia.

(Research design) Algorithm design: graph-based algorithm
- Builds on IR research that uses expert identification to improve retrieval relevance. The approach resembles citation analysis.
(Evaluation): We carry out experiments on both simulated and real-world data sets obtained from Delicious, and show that SPEAR is able to detect the difference between different types of experts, and is more resistant to spammers than other methods.

SPEAR – SPamming-resistant Expertise Analysis and Ranking
- Based on the HITS (Hypertext Induced Topic Search) algorithm
  - Hubs: pages that point to many good pages
  - Authorities: pages that are pointed to by many good pages
- The concepts of expertise and quality are analogous to hubs and authorities
  - Users are hubs – we find useful pages through them
  - Pages are authorities – they provide relevant information
- Difference from HITS: only users (experts) can point to documents (authorities); the relationship cannot be reversed.
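
The hub/authority analogy can be illustrated with a minimal Python sketch. It assumes bookmarks are plain (user, document) pairs and that scores are normalized to sum to 1; the function name `bipartite_hits` and the sample users are hypothetical. This is only HITS-style mutual reinforcement on a user-to-document graph and ignores who bookmarked first, so it is not the full SPEAR method.

```python
def _normalize(scores):
    """Scale scores so they sum to 1 (left unchanged if the total is 0)."""
    total = sum(scores.values())
    if total == 0:
        return scores
    return {key: value / total for key, value in scores.items()}


def bipartite_hits(bookmarks, iterations=20):
    """HITS-style reinforcement where edges only go from users to documents.

    bookmarks: iterable of (user, document) pairs (assumed input format).
    Returns (expertise, quality) dictionaries.
    """
    bookmarks = list(bookmarks)
    users = sorted({user for user, _doc in bookmarks})
    docs = sorted({doc for _user, doc in bookmarks})
    expertise = dict.fromkeys(users, 1.0)
    quality = dict.fromkeys(docs, 1.0)

    for _ in range(iterations):
        # A user's expertise is the summed quality of the documents they bookmarked.
        new_expertise = dict.fromkeys(users, 0.0)
        for user, doc in bookmarks:
            new_expertise[user] += quality[doc]
        # A document's quality is the summed expertise of the users who bookmarked it.
        new_quality = dict.fromkeys(docs, 0.0)
        for user, doc in bookmarks:
            new_quality[doc] += new_expertise[user]
        expertise = _normalize(new_expertise)
        quality = _normalize(new_quality)
    return expertise, quality


if __name__ == "__main__":
    sample = [("alice", "d1"), ("alice", "d2"), ("bob", "d1"), ("carol", "d3")]
    expertise, quality = bipartite_hits(sample)
    for user in sorted(expertise, key=expertise.get, reverse=True):
        print(user, round(expertise[user], 3))
```

In the sample, alice ranks above bob because she shares a document with him and has one more, while carol, who bookmarks an isolated page, ends up last.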

Algorithm
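
A minimal Python sketch of how the two design principles could be combined: HITS-style mutual reinforcement, with each user-to-document edge weighted by how early the user bookmarked the document. The weighting used here (a credit function applied to 1 plus the number of users who tagged the document later) and the square-root credit function are assumptions of this sketch, as are the names `spear` and `credit` and the (user, document, timestamp) input format. Consult the paper for its exact formulation.

```python
import math
from collections import defaultdict


def spear(taggings, iterations=50, credit=math.sqrt):
    """Time-weighted expertise ranking sketch.

    taggings: list of (user, document, timestamp) triples; each user is
    assumed to tag a given document at most once.
    Returns (expertise, quality) dictionaries normalized to sum to 1.
    """
    by_doc = defaultdict(list)
    for user, doc, ts in taggings:
        by_doc[doc].append((ts, user))

    # Edge weight: earlier taggers of a document earn more credit.
    weight = {}
    for doc, entries in by_doc.items():
        for ts, user in entries:
            later = sum(1 for other_ts, _ in entries if other_ts > ts)
            weight[(user, doc)] = credit(1 + later)

    users = sorted({user for user, _doc in weight})
    docs = sorted(by_doc)
    expertise = dict.fromkeys(users, 1.0)
    quality = dict.fromkeys(docs, 1.0)

    for _ in range(iterations):
        new_expertise = dict.fromkeys(users, 0.0)
        for (user, doc), w in weight.items():
            new_expertise[user] += w * quality[doc]
        new_quality = dict.fromkeys(docs, 0.0)
        for (user, doc), w in weight.items():
            new_quality[doc] += w * new_expertise[user]
        e_total = sum(new_expertise.values()) or 1.0
        q_total = sum(new_quality.values()) or 1.0
        expertise = {u: s / e_total for u, s in new_expertise.items()}
        quality = {d: s / q_total for d, s in new_quality.items()}
    return expertise, quality


if __name__ == "__main__":
    sample = [("early", "d1", 1), ("mid", "d1", 2), ("late", "d1", 3)]
    expertise, _quality = spear(sample)
    print(sorted(expertise, key=expertise.get, reverse=True))
```

Passing `credit=lambda x: 1.0` removes the time weighting, which reduces the sketch to plain mutual reinforcement and makes the discoverer and the followers of a document indistinguishable.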

Experimental design
- Workaround: insert simulated users into real-world data from Delicious.com and check where they end up after ranking
- The comparison uses 50 tags from Delicious.com, covering 515,000 real users, 71,300 real pages, and 2,190,000 real bookmarks
- Probabilistic simulation: simulated users are generated with four parameters
  - P1: Number of the user's bookmarks – active or inactive user?
  - P2: Newness – fraction of Web pages not already in the data set
  - P3: Time preference – discoverer or follower?
  - P4: Document preference – high quality or low quality?
- Six user types are distinguished
  - Geek – lots of high quality documents, discoverer (Distinguished Researcher)
  - Veteran – high quality documents, discoverer (Professor)
  - Newcomer – high quality documents, follower (PhD student)
  - Flooder – lots of random documents, follower (found in Delicious)
  - Promoter – some documents (most are their own), discoverer (found in Delicious)
  - Trojan – some documents, follower (next-gen spammer)
- The performance of three algorithms is compared
  - SPEAR
  - HITS
  - FREQ, a frequency-count ranking algorithm
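
The four simulation parameters can be pictured with a small Python sketch. Everything concrete here is hypothetical: the names `SimulatedProfile` and `simulate_user`, the way each parameter is turned into a probability, the time horizon, and the example values for a Geek-like and a Flooder-like profile. The notes do not record the paper's actual distributions or parameter values.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulatedProfile:
    num_bookmarks: int   # P1: how active the user is
    newness: float       # P2: probability a bookmark is a page not yet in the data set
    early_bias: float    # P3: probability of bookmarking early (discoverer) vs late (follower)
    quality_bias: float  # P4: probability of picking a high-quality page


def simulate_user(profile, name, good_docs, other_docs, horizon=1000, seed=0):
    """Generate (user, document, timestamp) bookmarks for one simulated user.

    Duplicate documents are possible in this sketch and are not filtered out.
    """
    rng = random.Random(seed)
    bookmarks = []
    for i in range(profile.num_bookmarks):
        if rng.random() < profile.newness:
            doc = f"{name}-new-{i}"          # a page only this user has
        elif rng.random() < profile.quality_bias:
            doc = rng.choice(good_docs)
        else:
            doc = rng.choice(other_docs)
        if rng.random() < profile.early_bias:
            ts = rng.randint(0, horizon // 10)        # early in the timeline
        else:
            ts = rng.randint(horizon // 2, horizon)   # late in the timeline
        bookmarks.append((name, doc, ts))
    return bookmarks


# Hypothetical parameter choices for two of the six user types.
GEEK = SimulatedProfile(num_bookmarks=200, newness=0.1, early_bias=0.9, quality_bias=0.9)
FLOODER = SimulatedProfile(num_bookmarks=500, newness=0.1, early_bias=0.1, quality_bias=0.1)

if __name__ == "__main__":
    marks = simulate_user(GEEK, "geek", ["good-1", "good-2"], ["other-1"])
    print(len(marks), marks[:3])
```

The generated triples can then be appended to the real bookmarks before ranking, which mirrors the workaround of checking where the simulated users end up.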

Results
- SPEAR distinguishes the three types of spammers more effectively than the other two algorithms
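
As a toy illustration of why a ranking based only on counts is easy to game, the Python sketch below ranks users by how many bookmarks they hold. The function `freq_rank` and the sample data are hypothetical, and the paper's FREQ baseline may differ in detail; this is not a reproduction of the reported experiments.

```python
from collections import Counter


def freq_rank(bookmarks):
    """Rank users by number of bookmarks, most active first."""
    counts = Counter(user for user, _doc in bookmarks)
    return [user for user, _count in counts.most_common()]


if __name__ == "__main__":
    # A flooder with five random pages outranks a veteran with two good ones.
    sample = [("flooder", f"random-{i}") for i in range(5)]
    sample += [("veteran", "good-1"), ("veteran", "good-2")]
    print(freq_rank(sample))  # ['flooder', 'veteran']
```

Because the count ignores both document quality and timing, volume alone decides the order, which is the weakness the design principles of SPEAR are meant to address.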

Note

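The paper conflates folksonomy with collaborative tagging in its definitions. This may be theoretically contentious, but it is acceptable if folksonomy is defined as a phenomenon. The authors' folksonomy is closer to a collaborative-tagging graph.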

