• How Litigation Chokes Info Flow

    Updated: 2010-08-31 06:44:22
    Short honk: I noted this article “Google Code Blog: An update on JavaOne.” My hunch is that Google is pulling out because of the dust up between Oracle and Google. If I am right, this is one more example of legal eagles’ stifling information flow. Wonderful. Stephen E Arnold, August 31, 2010 Freebie.

  • Yahoo Essay Sets a Record Straight

    Updated: 2010-08-31 06:34:18
    Anyone interested in the pay to play angle for search will want to read “Bubble Blinders: The Untold Story of the Search Business Model.” In addition to correcting some factual errors from high profile search wizards, the write up underscores how tough it is for one company to see that a big opportunity exists. More [...]

  • Attensity and Its Positioning

    Updated: 2010-08-31 06:23:02
    I found it notable that Attensity, a company known for its “deep extraction” technology, authored a feature in Mashable. Mashable is a Web publication that touches the throbbing heart of the Web world and its denizens. I cannot recall a company with roots in the arcane world of content processing and the government information projects [...]

  • Google-Phrenia: A Social Adjustment?

    Updated: 2010-08-31 06:12:57
    I am not a social goose. The goings-on at Orkut are a mystery to me. I think I signed up, saw lots of Brazilian info, and abandoned ship. No knock against Orkut or Brazil. I just don’t want digital pals unless those pals have a beak and feathers. I read with semi-interest the write up [...]

  • Exclusive Podcast Interview: David Fishman

    Updated: 2010-08-31 06:01:06
    I did a follow up interview with David Fishman, Lucid Imagination’s vice president of marketing. In this 10 minute discussion, the topic is the technical plumbing of Lucene/Solr. In the podcast, Mr. Fishman describes the free converter SolrCELL, the faceting capabilities of the Lucene/Solr system, and the Carrot-2 clustering software. If you want to know [...]

  • Today's Search Term: Stemming

    Updated: 2010-08-30 13:29:00
    Enterprise Search: The business and technology of corporate search. Related terms: lemmatization, normalize. Search engines use stemming as a means to determine the root of a given written word. Using a program or algorithm, all of the affixes to a word (in English, a prefix and/or suffix) are removed, leaving the root word. By implementing the rules of the given language, obstacles such as the third-person singular present (as "cries" is of the verb "cry" in English) can be handled so that words are accurately indexed. Stemmers become harder to design as the rules of the target language become more complex. For example, some languages have more verb and pronoun forms. Other languages do not
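
The affix-removal idea described in this entry can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule table `SUFFIX_RULES`, the `MIN_STEM_LENGTH` threshold, and the function name `toy_stem` are hypothetical and not taken from the post, and the sketch strips English suffixes only. Production stemmers use far larger rule sets.

```python
# Toy suffix-stripping stemmer: illustrative only, not a production algorithm.
# SUFFIX_RULES is a hypothetical, hand-picked table of English suffixes.
# Longer suffixes come first so that "ies" is tried before "es" or "s".
SUFFIX_RULES = [
    ("ies", "y"),  # cries -> cry (third-person singular present)
    ("ing", ""),   # indexing -> index
    ("ed", ""),    # indexed -> index
    ("es", ""),    # searches -> search
    ("s", ""),     # words -> word
]

# Assumed guard: never reduce a word to fewer than three characters.
MIN_STEM_LENGTH = 3


def toy_stem(word: str) -> str:
    """Return a crude root form by applying the first suffix rule that fits."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)] + replacement
            if len(stem) >= MIN_STEM_LENGTH:
                return stem
    # No rule applied (or every candidate stem was too short): keep the word.
    return word


if __name__ == "__main__":
    for term in ["cries", "cry", "indexing", "searches", "is"]:
        print(term, "->", toy_stem(term))
```

With these rules, "cries" and "cry" map to the same index term, which is the point the entry makes. The sketch also shows why stemmers get harder as language rules grow: "running" comes out as "runn" because nothing here handles doubled consonants.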

  • Search Terms

    Updated: 2010-08-30 12:10:00
    NIE maintains a Glossary of Enterprise Search Terms related to the business and technology of search on our site, which you can browse at your convenience. This is an active list, and we welcome your suggestions and additions. Now we're going to select and post one of these each day or so in the blog. Some may be familiar, but we hope some will be new to you. Enjoy! Posted by Catherine at 05:10:00 AM.

  • Leadspace Prepping for a Take Off?

    Updated: 2010-08-30 06:55:25
    One of my readers in Israel forwarded to me an update on Leadspace Ltd., founded by Amnon Mishor and Yaron Karasik. Originally named Data Essence, Leadspace is rumored to have applied its proprietary semantic technology to the tough problem of finding information. From what I can gather, the system taps features of existing linked data, [...]

  • Google Cries over Spilled Milk

    Updated: 2010-08-30 06:25:14
    Google has accused the government of favoring Microsoft. This is a recent news revelation on WebProNews.com’s “Google Cries Foul Over California Email Contract” citing Google’s inability to win the $60 million contract, which has gone to CompuCom Systems, an IT outsourcing company with ties to Microsoft. Even though Google is based in California, the City of [...]

  • NLP-Based Service Swingly Now at Bat

    Updated: 2010-08-30 06:22:53
    The new service Swingly strives to offer web enthusiasts one of the first web answer engines. The service works by taking text from a variety of sources on the web which could include social media or news articles and compiling them in web databases that can be used for answering questions. Semantic Web gives information [...]

  • Will Microsoft Be Able to Succeed Online?

    Updated: 2010-08-30 06:11:22
    It has been nearly a lost decade for Microsoft online, generating no returns on its Internet ventures. The ZDNet.com article “Microsoft’s Lost Eight Years Online: More Than $6 Billion Down the Tubes” discusses the financial hits of Microsoft’s Internet follies, comparing its fiscal reports of the past ten years. Microsoft has delivered a profit here [...]

  • Google and Its Global Street View Experiences

    Updated: 2010-08-30 06:02:41
    Special to Beyond Search: Innovative technological ideas have transformed our societies and lifestyles for the better since time immemorial, also affecting social norms and values. Such changes, as all changes do by default, go through a period of resistance before they are finally embraced. The recent Google Street View controversy in Germany is a perfect example, [...]

  • Chinese Anti-fraud Activist Attacked

    Updated: 2010-08-29 18:36:00
    Science: Assailants Attack China's Science Watchdog; CRI-English: Anti-fraud Activist Attacked in Beijing; Fang Zhouzi Attacked near his Residence in Beijing

  • Alex Smola starts a blog

    Updated: 2010-08-25 00:44:59
    Adventures in Data Land.

  • Boosted Decision Trees for Deep Learning

    Updated: 2010-08-23 18:18:53
    About 4 years ago, I speculated that decision trees qualify as a deep learning algorithm because they can make decisions which are substantially nonlinear in the input representation. Ping Li has shown this to be correct empirically at UAI, by showing that boosted decision trees can beat deep belief networks on versions of MNIST which are [...]

  • KDD 2010

    Updated: 2010-08-23 01:39:07
    There were several papers that seemed fairly interesting at KDD this year. The ones that caught my attention are: Xin Jin, Mingyang Zhang, Nan Zhang, and Gautam Das, Versatile Publishing For Privacy Preservation. This paper provides a conservative method for safely determining which data is publishable from any complete source of information (for example, [...]

  • The Spectrum of Time Series Forms

    Updated: 2010-08-22 13:22:20
    I've been playing around with time series data recently. Seeing the many forms that these take, I was planning a post describing a bestiary of time series. This, it turns out, is too much work, so here I have collected...

  • Rob Schapire at NYC ML Meetup

    Updated: 2010-08-22 03:10:09
    I’ve been wanting to attend the NYC ML Meetup for some time and hope to make it next week on the 25th. Rob Schapire is talking about “Playing Repeated Games”, which in my experience is far more relevant to machine learning than the title might indicate.

  • Next Meeting: Triangle HUG

    Updated: 2010-08-19 01:58:44
    The next TriHUG meeting has been announced: Sept. 14. There will be two speakers: Wei Wei on Practical Hadoop Security and me on Hadoop and Lucene and Solr. For more info and to RSVP, see Triangle Hadoop Users Group.

  • SIGIR 2010 Query Representation and Understanding

    Updated: 2010-08-16 04:03:44
    Overview: Understanding the user's intent or information need that underlies a query has long been recognized as a crucial part of effective information retrieval. Despite this, retrieval models, in general, have not focused on explicitly representing intent, and query processing has been limited to simple transformations such as stemming or spelling correction. With the recent availability of large amounts of data about user behavior and queries in web search logs, there has been an upsurge in interest in new approaches to query understanding and representing intent. This workshop has the goal of

  • LNRE

    Updated: 2010-08-11 14:09:00
    Here is a good tutorial with Matlab examples about Statistical Estimation for Large Numbers of Rare Events (LNRE).

  • NLP Book

    Updated: 2010-08-07 13:33:00
    The Lousy Linguist: Notes on linguistics and cognition. Alias-i has just released a draft version of a book based on their NLP suite, LingPipe: "Our goal is to produce something with a little more breadth and depth and much more narrative structure than the current LingPipe tutorials. Something that a relative Java and natural language processing novice could work through from beginning to end, coming out with a fairly comprehensive knowledge of LingPipe and a good overview of some aspects of natural language processing." Enjoy! Posted by Chris at 9:33 AM.
