January 20, 2004
Categorizing Blogs?
Posted at 8:17 AM
Topix.net currently includes a handful of blogs that we've editorially selected as being newsy enough to fit with the other material we have. This includes stuff like Dan Gillmor's eJournal, Lawrence Lessig, TechDirt, Executive Summary, and so on.
Originally we thought we'd have the full blogiverse in the categorization engine. We actually had it running internally for a while. But the categorizer wasn't pulling much out of the blogs, and what it did pull didn't look very good next to the other stories.
Our categorizer likes references to very specific named entities -- at the local level, streets, jails, hospitals, parks, rivers, and so on. For national news we're scanning for politician names; for world news, references to political leaders and geographical features. But the AI wasn't finding much to grab onto in the blogs we crawled.
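To make the idea concrete, here is a minimal sketch of entity-driven categorization. It is not our actual categorizer: the entity lists, the category names, the `categorize` function, and the `min_hits` threshold are all made up for illustration. It only shows why a chatty post with no named entities gives this kind of system nothing to grab onto.

```python
import re

# Hypothetical gazetteers: every entity and category name here is invented
# for illustration and does not come from any real categorizer.
GAZETTEERS = {
    "local": ["Elm Street", "County Jail", "Mercy Hospital", "Riverside Park"],
    "national": ["Senator Jane Doe", "Governor John Roe"],
    "world": ["Prime Minister Alex Poe", "Danube River"],
}


def count_entities(text, entities):
    """Count how many distinct entities from the list appear in the text."""
    hits = 0
    for name in entities:
        # Case-insensitive match on word boundaries.
        if re.search(r"\b" + re.escape(name) + r"\b", text, re.IGNORECASE):
            hits += 1
    return hits


def categorize(text, min_hits=2):
    """Return the best-matching category, or None if too few entities match."""
    scores = {cat: count_entities(text, names) for cat, names in GAZETTEERS.items()}
    best = max(scores, key=scores.get)
    # Below the (assumed) threshold there is not enough evidence to categorize.
    return best if scores[best] >= min_hits else None


news = "A fire near Elm Street sent two people to Mercy Hospital on Monday."
blog = "Interesting take on that story from yesterday. I mostly agree with it."

print(categorize(news))
print(categorize(blog))
```

The news-style sentence names two places and lands in a category; the blog-style comment names nothing and comes back uncategorized.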
I guess this is because a lot of blog material is informal discussion that is often more follow-up to news posted elsewhere than direct reporting. Blog posts are the feedback to the story, not the story itself. And the chatty tone often omits the who-what-where-when-why that makes a news story or press release a self-contained entity.
Perhaps blogs need a system like Dave Winer's categorization scheme, although, despite my ODP background, I'm skeptical of ad-hoc user-generated taxonomies. Or maybe the rumors will pan out and a new system like Kinja will make sense of it all.