Deadline for HCOMP 2011 extended: Submission due on April 29th

Due to a significant number of requests, and to conflicts with other conferences and workshops, we have decided to extend the submission deadline for HCOMP 2011. The new deadline is April 29th.

If you want to know more, you can see the call for papers and workshop announcement.

Video from NYC Crowdsourcing Meetup

On April 13th, we hosted the NYC Crowdsourcing Meetup at NYU Stern. For those who missed it, you can now download an audio-only podcast version or watch the video from the event, together with the slide presentations:


The speakers at the event:
  • John Horton, Staff Economist of oDesk. John talked about the challenges of matching employers with contractors in an online marketplace. Specifically, he described mechanisms for getting contractors to give an accurate description of their skills, avoiding issues such as over-tagging a profile with irrelevant keywords or over-claiming qualifications.
  • Amanda Michel, Director of Distributed Reporting at ProPublica. Amanda talked about the crowdsourcing efforts of ProPublica and how they use the crowd to enable better journalistic investigation of the topics they are researching. At some point during the presentation, Amanda quoted from one of their studies: "ProPublica pulled a random sample of 520 of the roughly 6,000 approved projects to examine stimulus progress around the country. That sample is large enough to estimate national patterns with a margin of error of plus or minus 4.5 percentage points." (A quick sanity check of that margin of error appears after this list.) Honestly, a tear came to my eye when I compared that with the corresponding practices of Greek newsrooms, which typically operate with samples of n=1 or n=0.
  • Todd Carter, CEO and Co-Founder of Tagasauris. Todd described Tagasauris, a system for annotating and tagging media files. He described the annotation effort for Magnum Photos (sample photos in their collection include the Afghan refugee girl, Marilyn Monroe on top of the vent, and many other iconic photos). A highlight was the discovery of a "lost" set of images from the shooting of the movie "American Graffiti". These images, shot by Dennis Stock, were in the Magnum archive but were impossible to find, as they lacked any tags or descriptions. After the annotation effort from Tagasauris, the lost set of photos was rediscovered.
  • Panos Ipeirotis, representing AdSafe Media. I talked about our efforts at AdSafe to use crowdsourcing to create machine learning systems for classifying web pages.
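As an aside, here is a quick sanity check of the margin of error that Amanda quoted. This is a rough back-of-the-envelope calculation assuming the standard worst-case binomial formula at a 95% confidence level; ProPublica's exact methodology is not described in the post.

# Rough check of the quoted margin of error for a sample of 520 projects.
# Assumes the worst-case proportion p = 0.5 and a 95% confidence level.
from math import sqrt

n = 520
margin_of_error = 1.96 * sqrt(0.5 * 0.5 / n)
print(f"{margin_of_error:.1%}")  # ~4.3%, in the same ballpark as the quoted 4.5 points
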
It was a lively and successful event. If there is enough interest and participation, I think this is an event that can be repeated periodically.

NYC Crowdsourcing Meetup: April 13th, 6.30pm

Join us for the first-ever New York City Crowdsourcing meetup, hosted by NYU and sponsored by CrowdFlower:


Pizza, beer, and thought-provoking conversation about the future of work. Come listen, ask, and debate how crowdsourcing is changing everything from philanthropy and urban planning to creative design and enterprise solutions.

Confirmed Speakers:

  • Lukas Biewald, CEO and Co-Founder of CrowdFlower
  • Todd Carter, CEO and Co-Founder of Tagasauris
  • John Horton, Chief Economist of oDesk
  • Panos Ipeirotis, Associate Professor at Stern School of Business, NYU
  • Amanda Michel, Director of Distributed Reporting at ProPublica
  • Bartek Ringwelski, CEO and Co-Founder of SkillSlate
  • Trebor Scholz, Associate Professor in Media & Culture at The New School University

Tutorial on Crowdsourcing and Human Computation

Last week, Praveen Paritosh from Google and I presented a 6-hour tutorial on crowdsourcing and human computation at the WWW 2011 conference. The title of the tutorial was "Managing Crowdsourced Human Computation".

My slides from the tutorial are available now on Slideshare:




Once Praveen gets clearance from Google, we will post his slides as well.

Judging from all the crap that I get to review lately, I was getting pessimistic about the quality of research on crowdsourcing. However, while preparing the tutorial, I realized just how much high-quality research is being published. We had 6 hours for the tutorial, and we still did not have enough time to cover many really interesting papers. I had to refer people to other, more "specialized" tutorials (e.g., on linguistic annotation, on search relevance, etc.), which I mention at the end of the slides.

Special thanks go to my PhD student, Jing Wang, for her slides on market design, Matt Lease for his excellent list of pointers for crowdsourcing resources, Omar Alonso for his tutorial slides on crowdsourcing for search relevance, Alex Quinn and Ben Bederson for their survey on human computation, and Winter Mason for sharing his slides from his CSDM keynote. And all the other researchers for making crowdsourcing and human computation an exciting field for research!

Last but not least: Luis von Ahn and Edith Law will be presenting another tutorial on human computation at AAAI, in San Francisco, on August 8th. We will be organizing the HCOMP 2011 workshop in conjunction with AAAI as well! The submission deadline is April 22nd! Do not forget to submit!

An ingenious application of crowdsourcing: Fix reviews' grammar, improve sales

I have been doing research on the economic impact of product reviews for a while. One thing that we have noticed is that the quality of the reviews can have an impact on product sales, independently of the polarity of the review.

High-quality reviews improve product sales

A well-written review tends to inspire confidence about the product, even if the review is negative. Such reviews are typically perceived as objective and thorough. So a high-quality but negative review may serve as reassurance that the negative aspects of the product are not that bad after all. For example, a negative review such as "horrible battery life... in my tests the battery lasts barely longer than 24 hours..." may be perceived as positive by customers who consider a 24-hour battery life more than sufficient.

In our recent (award-winning) WWW 2011 paper "Towards a Theory Model for Product Search", we noticed that demand for a hotel increases if its online reviews on TripAdvisor and Travelocity are well-written, without spelling errors; this holds whether the reviews are positive or negative. In our TKDE paper "Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics", we observed similar trends for products sold and reviewed on Amazon.com.

And what can we do knowing this?

Being in a business school, I found that these findings were considered informative but not deeply interesting. Do not forget, research in business schools is centered on causality and on policy-making. Yes, we know that it is important for reviews to be well-written and informative if we want the product to sell well. But if we cannot do anything about it, it is not deeply interesting. It is almost like knowing that demand for summer resorts drops during the cold months!

But here comes the twist...

The crowdsourcing solution

Last week, over drinks during the WWW conference, I learned about a fascinating application of crowdsourcing that attacked exactly this issue.

An online retailer noticed that, indeed, products with high-quality reviews sell well. So, they decided to take action: they used Amazon Mechanical Turk to improve the quality of their reviews. Using the Find-Fix-Verify pattern, they had Mechanical Turk workers examine a few million product reviews. (Here are the archived versions of the HITs: Find, Fix, Verify... and if you have not figured out the firm name by now, it is Zappos :-) ) For the reviews with mistakes, they fixed the spelling and grammar errors! Thus, they effectively improved the quality of the reviews on their website. And, correspondingly, they improved the demand for their products!
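To make the workflow concrete, here is a minimal sketch of how a Find-Fix-Verify pass over a single review might look. The helpers post_hit, majority_vote, and apply_fix are hypothetical placeholders standing in for whatever crowdsourcing platform calls are actually used; this is not the real Zappos or Mechanical Turk implementation.

def find_fix_verify(review, post_hit, majority_vote, apply_fix, n=3):
    """Clean up spelling and grammar in `review` without changing its content."""
    # FIND: n workers independently flag error spans; keep the spans most agree on.
    spans = majority_vote([post_hit("find", text=review) for _ in range(n)])

    for span in spans:
        # FIX: a different set of workers proposes a correction for the flagged span.
        candidates = [post_hit("fix", text=review, span=span) for _ in range(n)]
        # VERIFY: a third set of workers votes on which correction, if any, to accept.
        winner = majority_vote([post_hit("verify", text=review, span=span,
                                         candidates=candidates) for _ in range(n)])
        if winner is not None:
            review = apply_fix(review, span, winner)
    return review
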

While I do not know the exact revenue improvement, I was told that it was substantial. Given that the e-tailer spent at least 10 cents per review, and that they examined approximately 5 million reviews, this is an expense of a few hundred thousand dollars. (My archive on MTurk-Tracker kind of confirms these numbers.) So, the expected revenue improvement should have been at least a few million dollars!
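A quick back-of-the-envelope check of these figures; both numbers are the rough estimates quoted above, not actual accounting data.

# Back-of-the-envelope check of the figures above (rough estimates, not Zappos data).
reviews_examined = 5_000_000    # approximately 5 million reviews
cost_per_review = 0.10          # at least 10 cents per review across Find/Fix/Verify
print(f"Estimated spend: ${reviews_examined * cost_per_review:,.0f}")  # -> $500,000
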

Ethical? I would say yes. Notice that they are not fixing the polarity or the content of the reviews; they just change the language to be correct and error-free. I can see the counter-argument that the writing style allows us to judge whether a review is serious or not, so artificially improving the writing style may be considered interference with the perceived objectivity of user-generated reviews.

But is it ingenious? Yes! It is one of these solutions that is sitting in front of you but you just cannot see it. And this is what makes it ingenious.

Crowdsourcing goes professional: The rise of the verticals

Over the last few months, I have been seeing a trend. Instead of letting end-users interact directly with the crowd (e.g., on Mechanical Turk), we see a rise in the number of solutions that target a very specific vertical.
Add services like Trada for crowd-optimizing paid advertising campaigns, uTest for crowd-testing software applications, and so on, and you will see that for most crowd applications there is now a professionally developed crowd-app.

Why do we see these efforts? This is the time when most people realize that crowdsourcing is not that simple. Using Mechanical Turk directly is a costly enterprise and cannot be done effectively by amateurs: the interface needs to be professionally designed, quality control needs to be done intelligently, and the crowd needs to be managed in the same way that any employee is managed. Most companies do not have the time or the resources to invest in such solutions. So we see the rise of verticals that address the most common tasks that used to be accomplished on Mechanical Turk.

(Interestingly enough, if I remember correctly, the rise of vertical solutions was also a phase during web search. In the period in which AltaVista started being spammed and full of irrelevant results, we saw the rise of topic-specific search engines that were trying to eliminate the problems of polysemy by letting you search only for web pages within a given topic.)

For me, this is the signal that crowdsourcing will stop being the fad of the day. Amateurish solutions will be shunned, and most people will find it cheaper to just use the services of the verticals above. Saying "oh, I paid just $[add offensively low dollar amount] to do [add trivial task] on Mechanical Turk" will stop being a novelty, and people will just point to a company that does the same thing professionally and at a large scale.

This also means that the crowdsourcing space will become increasingly "boring." All the low-hanging fruit will be gone. Only people who are willing to invest time and effort in the long term will get into the space.

And it will be the time that we will get to separate the wheat from the chaff.

Uncovering an advertising fraud scheme. Or "the Internet is for porn"


Do Mechanical Turk workers lie about their location?

A few weeks back, Dahn Tamir graciously allowed me to take a peek at the data that he has been gathering about his workers on Mechanical Turk. He has assigned tasks over time to more than 50,000 workers on Mechanical Turk, so I consider his data to be one of the most representative samples of workers.

One of the nice tasks that he has been running is a simple HIT in which he asks workers to report their location. At the same time, in this task, Dahn was recording the IP address of the worker. Why was this task nice? Because there is absolutely no incentive for the workers to be truthful: the submission will be accepted and paid no matter what. In a sense, it is a test that checks whether workers will be truthful in cases where it is not possible to verify their accuracy.

So, we used this test to check how sincere the workers are: we can simply geocode the IP address and find out the actual location of the worker (with some degree of error, but good enough for approximation purposes). For the workers who reported being based in the US (approximately 22,000 workers), the HIT asked for the worker's zip code, making it easy to assign an approximate latitude/longitude location.

To measure how accurately the workers report their location, we measured the distance between the location of the IP address and the location of the zip code. The plot below shows the distribution of the differences:




As you can see, most of the workers were pretty truthful about their location. The difference in distance was less than 10 miles for more than 60% of the workers; this difference can easily be explained by the limited accuracy of the geocoding APIs and by the approximation of using zip code centroids.

Of course, the flip side of the coin is that a significant fraction of the workers were essentially lying about their location: for 10% of the workers (i.e., ~2,250 of them) the IP address was more than 100 miles away from the reported zip code, and for 2% of the workers (i.e., ~500 workers) the distance was more than 1,000 miles.

The biggest liar? A worker from Chennai, India, who reported a zip code corresponding to Tampa, Florida. The IP address was a cool 9,500 miles away from the reported location!
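For reference, here is a minimal sketch of the distance check described above. The lookups ip_to_latlon and zip_to_latlon are hypothetical placeholders (a GeoIP database and a zip-code centroid table); only the haversine computation is spelled out.

from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def location_discrepancy(worker, ip_to_latlon, zip_to_latlon):
    """Miles between the IP-derived location and the self-reported zip code."""
    ip_lat, ip_lon = ip_to_latlon(worker["ip"])             # hypothetical GeoIP lookup
    zip_lat, zip_lon = zip_to_latlon(worker["reported_zip"])  # hypothetical zip centroid table
    return haversine_miles(ip_lat, ip_lon, zip_lat, zip_lon)
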

The Road to Serfdom, ACM Edition

<rant>

A couple of days back, I got the following email from ACM:

Dear Moderator/Chairs,

This is being sent to everyone with the chairs cc'd as the last and final requeset for the eform below to be completed or your panel overview abstract will be removed from the WWW 2011 Companion Publication and will NOT appear in the ACM DL.

Your prompt and immediate attention to the form below is needed.

permission release form URL: ....

ACM Copyrights & Permissions

Given that this was the "last and final requeset" [sic], I assumed that somehow I had missed the previous requests. So, I checked my email to find out how late I was. Nope. Nothing in the archive, nothing in the trash, nothing in the spam, no entry in the delivery log. This was the first notification sent by ACM. They had simply forgotten about it. But since they were running late, why not just threaten the authors? It is so much easier to pass the blame to others and be the first one to get aggressive.

What happened, ACM? Did you start getting advice on customer service from your pals at Sheridan Printing, who tend to send requests like this?

But I should not have been so surprised. This email just reflects the overall attitude of ACM. I have experienced this many times in the past. Anyway, I decided to sign the e-form, without firing back.

Donating copyright to ACM

Signing the form used to be a mechanical action. However, after reading Matt Blaze's post on copyright and academic publishing, I decided to read the form a little more carefully, to see exactly what I was signing.

As usual, we start with a transfer of copyright to ACM. The authors agree to transfer all their copyright rights to ACM, blah blah...

Wait a minute! Why does ACM need to own the copyright? No good reason. To publish and distribute the article, ACM just needs a non-exclusive license to print and distribute. There is no need to own the copyright.

If we follow ACM's logic, any artist who wants to see their work exhibited in a museum would need to give up ownership of their creations and hand them over to the museum. For free. Without expecting any royalties in return. Ever. Furthermore, instead of promoting the work, the museum would lock it in a "patron members access only" area. For everyone else, the museum would demand a separate entrance ticket to see each piece in the collection. (Say, a friendly price of $5 per painting?)

Anyway, let's not belabor the point with copyright. We know that ACM's policy sucks. We know that ACM is a bureaucracy serving just itself and not its members or the profession. Let's move on.

Let's move to the point that really got me fired up.

Protecting ACM from liability

What got me really pissed was the last part of the agreement:

Liability Waiver

* Your grant of permission is conditional upon you agreeing to the terms set out below.

I hereby release and discharge ACM and other publication sponsors and organizers from any and all liability arising out of my inclusion in the publication, or in connection with the performance of any of the activities described in this document as permitted herein. This includes, but is not limited to, my right of privacy or publicity, copyright, patent rights, trade secret rights, moral rights or trademark rights.

All permissions and releases granted by me herein shall be effective in perpetuity unless otherwise stipulated, and extend and apply to the ACM and its assigns, contractors, sublicensed distributors, successors and agents.

So, not only should we "voluntarily" donate ownership of our copyright to ACM; we also need to protect ACM from any liability.

In other words, ACM wants to get all the upside from owning the copyright, without ever distributing royalties to the contributing authors. (Not that it would be worth much; it is a matter of principle and a signal of respect to the authors, not an issue of monetary importance.) At the same time, ACM also wants the authors to guarantee that if there is any problem with the copyright, the author will be the one liable for the damages.

All the upside for ACM, no revenue to the authors. All the downside to the authors, no obligations for ACM.

Thank you ACM for caring so much about your members. You will not be missed when you disappear.

Yours truly,
A lifetime member of ACM.

PS: In retrospect, the title of the post is offensive. From Wikipedia's definition of serfdom: "Serfdom included the forced labor of serfs bound to a hereditary plot of land owned by a lord in return for protection". In other words, the lords took the product of the serfs' work, but in return they provided the protection and military support to defend the serfs who were working the land. ACM wants the serfs to "protect the land" as well. I owe an apology to the feudal lords for the comparison.

</rant>

The promise and fear of an assembly line for knowledge work

Last week, Amanda Michel from ProPublica and I presented at the CAR 2011 conference (CAR stands for Computer-Assisted Reporting) on how to best use Mechanical Turk for a variety of tasks pertaining to data-driven journalism.

We discussed issues of quality assurance and how TurKit-like workflow-based tasks can generate nice outcomes, and we briefly touched upon the CrowdForge work by Niki Kittur and the team at CMU, which shows that crowdsourcing can potentially generate intellectual outcomes comparable to those of trained humans.

The discussion after the session was a mix of excitement and fear. We have seen in the past how "assembly line" work for industrial production led to massive productivity improvements and was the basis for much of the progress of the 19th and 20th centuries. But that was for mechanical work. Yes, it replaced the centuries-old crafts of blacksmiths, carpenters, and potters, but that was just part of progress.

What happens if we see now the assembly line extended into tasks that were traditionally considered creative and intellectual in nature? What would be the effect of an assembly line for knowledge work?

A few months back, I quoted Marx and Engels who, back in 1848, wrote in their Communist manifesto:
the work of the proletarians has lost all individual character, and, consequently, all charm for the workman. ... [The workman] becomes an appendage of the machine, and it is only the most simple, most monotonous, and most easily acquired knack, that is required of him
(Btw, TIME magazine liked that connection enough to put it into their own article about Mechanical Turk.)

But how likely is it that this style of work will extend further into intellectual work? Are these Mechanical Turk experiments something generalizable, or just cute proof-of-concept experiments?

I was reminded of this question today, when I realized that many intellectual tasks are already commoditized:

The article "Inside the multimillion-dollar essay-scoring business: Behind the scenes of standardized testing" gives a dreadful view of now essays are being scored for the standardized tests.

Based on the description in the article, the (human-based) scoring process "goes too fast; relies on cheap, inexperienced labor; and does not accurately assess student learning." Needless to say, the workers were not exactly enthusiastic about their work. Match that with computer-assisted scoring of essays, and you have an MTurk-like environment for much more intellectually demanding tasks...

After reading this essay-scoring-mill story, I started feeling a little uneasy. MTurk-style work seems too far away to be in my future, so the discussion is always, ahem, academic. But the essay scoring brought the concept a little too close for comfort.

What was the main factor for Watson's success? Hardware, software, or data?

I can think of three things that may have allowed Watson to win Jeopardy:
  • Hardware: From a comment at Shtetl-Optimized, "The hardware Watson was running on is said to be capable of 80 teraflops. According to the TOP500 list for November 2000, the fastest supercomputer (ASCI White) was capable of 4.9 teraflops." So, computers became roughly 16 times faster (80 / 4.9 ≈ 16) over the last 10 years. Is this the winning factor?
  • Software: A couple of months back, Noam Nisan reported: "while improvements in hardware accounted for an approximate 1,000 fold increase in calculation speed over a 15-year time-span, improvements in algorithms accounted for an over 43,000 fold increase." So, maybe it is just the better NLP and machine learning algorithms that played the crucial role in this success.
  • Data: 10 years back we did not have Wikipedia, and its derivatives, such as Wiktionary, WikiQuote, Wikispecies, DBPedia, etc. Such resources add a tremendous value for finding connections between concepts. 
My gut feeling says that the crucial factor was the development of the data resources that allowed Watson to answer such trivia questions. Without discounting the importance of hardware and software advances, it would not have been possible for Watson to answer any of these questions without such tremendously organized and rich data sources. The IBM WebFountain has been around for a while, but trying to structure unstructured web data, and get meaning out of such data, is much harder than taking and analyzing the nicely organized data in DBPedia.

To paraphrase a loosely-related quote: Better data usually beats better algorithms.


Browsers of Mechanical Turk workers

Yesterday, Michael Bernstein asked on Twitter:


I recalled that my favorite go-to source for Turk statistics, Dahn Tamir, used Mechanical Turk a couple of years back to examine the connection between browser use and political orientation.

I asked Dahn if he had collected more extensive data. He is running tasks using a very large number of workers, so his sample would have been representative.

I was not disappointed: Dahn had data from approximately 19,000 workers, based on 75,000 worker requests over the last 6 months, recording the "user-agent" part of the HTTP request. For each worker ID, we counted how many different user agents we had seen. The maximum was 19, with an average value of 1.3.

Then, we measured the number of worker IDs per user-agent string. If a worker had registered multiple user-agent strings, we split the credit across browsers. For example, if the same worker ID had one session with IE8 and one with IE9, then we gave 0.5 credit to IE8 and 0.5 credit to IE9.
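Here is a minimal sketch of this credit-splitting step. The sessions input is assumed to be a list of (worker_id, user_agent) pairs extracted from the request logs; the field layout is an assumption for illustration, not the actual schema of the dataset.

from collections import defaultdict

def browser_shares(sessions):
    # 1. Collect the set of distinct user-agent strings seen for each worker.
    agents_per_worker = defaultdict(set)
    for worker_id, user_agent in sessions:
        agents_per_worker[worker_id].add(user_agent)

    # 2. Each worker contributes a total credit of 1, split evenly across the
    #    user agents they appeared with (e.g., 0.5 to IE8 and 0.5 to IE9).
    credit = defaultdict(float)
    for agents in agents_per_worker.values():
        for agent in agents:
            credit[agent] += 1.0 / len(agents)
    return dict(credit)
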

I further processed the data using the UserAgentString API and generated this Google spreadsheet.

After processing the data, here are the high level results:

Operating System Usage




Yep, most (~85%-90%) of MTurk workers use Windows. I found the prevalence of Windows XP very surprising!


Browser Usage


I found the relatively low percentage of IE users very interesting.

For reference, here are the most common versions for each of the most-used browsers by Mechanical Turk workers:






One piece of good news is that most MTurk workers tend to use recent versions of the browsers, with good support for the latest web technologies (CSS, JavaScript, etc.). Interestingly, though, a very significant fraction still uses Windows XP!

In the future, we may repeat the measurements while also keeping statistics about JavaScript versions, plugins, Flash support, etc.

Until then, enjoy and code safely, knowing that you do not have to support IE6 and IE7 in your MTurk HITs.

3rd Human Computation Workshop (HCOMP 2011), San Francisco, August 7 or 8

I am just posting this here to build some awareness about HCOMP 2011, the 3rd Human Computation Workshop, which will be organized together with AAAI in San Francisco, on August 7 or 8. You can also find more detailed information about the workshop at http://www.humancomputation.com. The submission deadline has been extended from April 22nd to April 29th.

Human Computation is the study of systems where humans perform a major part of the computation or are an integral part of the overall computational system. Over the past few years, we have observed a proliferation of related workshops, new courses, and tutorials, scattered across many conferences.

In this 3rd Human Computation Workshop (HCOMP 2011), we hope to draw together participants across disciplines -- machine learning, HCI, mechanism and market design, information retrieval, decision-theoretic planning, optimization, computer vision -- for a stimulating full-day workshop at AAAI in beautiful San Francisco this summer. There will be presentations of new work, lively discussions, poster and demo sessions, and invited talks by Eric Horvitz, Jennifer Wortman, and more. There will also be a 4-hour tutorial, "Human Computation: Core Research Questions and State of the Art", at AAAI on August 7, which will give newcomers and current researchers a bird's eye view of the research landscape of human computation.




Call for Papers

3rd Human Computation Workshop (HCOMP 2011)
co-located with AAAI 2011
August 7 or 8, San Francisco, CA
http://www.humancomputation.com

Human computation is a relatively new research area that studies how to build intelligent systems that involve human computers, with each of them performing computation (e.g., image classification, translation, and protein folding) that leverages human intelligence but challenges even the most sophisticated AI algorithms that exist today. With the immense growth of the Web, human computation systems can now leverage the abilities of an unprecedented number of Internet users to perform complex computation. Various genres of human computation applications are available today, including games with a purpose (e.g., the ESP Game) that generate useful data through gameplay, crowdsourcing marketplaces (e.g., Amazon Mechanical Turk) that coordinate workers to perform tasks for monetary rewards, and identity verification systems (e.g., reCAPTCHA) that generate useful data through users performing computation in exchange for access to online content.

Despite the variety of human computation applications, there exist many common core research issues. How can we design mechanisms for querying human computers in a way that incentivizes or encourages truthful responses? What are the techniques for aggregating noisy outputs from multiple human computers? How do we effectively assign tasks to human computers to match their particular expertise and interests? What are some programming paradigms for designing algorithms that effectively leverage the humans in the loop? How do we build human computation systems that involve the joint efforts of both machines and humans, trading off each of their particular strengths and weaknesses? Significant advances on such questions will likely need to draw on many disciplines, including machine learning, mechanism and market design, information retrieval, decision-theoretic planning, optimization, human-computer interaction, etc.

The workshop recognizes the growing opportunity for AI to function as an enabling technology in human computation systems. At the same time, AI can leverage technical advances and data collected from human computation systems for its own advancement. The goal of HCOMP 2011 is to bring together academic and industry researchers from diverse subfields in a stimulating discussion of existing human computation applications and future directions of this relatively new subject area. The workshop also aims to broaden the scope of human computation beyond the issue of data collection, toward the study of systems where humans perform a major part of the computation or are an integral part of the overall computational system.

Topics

Topics of interest include, but are not limited to:

  • Programming languages, tools and platforms to support human computation
  • Domain-specific challenges in human computation
  • Methods for estimating the cost, reliability, and skill of labelers
  • Methods for designing and controlling workflows for human computation tasks
  • Empirical and formal models of incentives in human computation systems
  • Benefits of one-time versus repeated labeling
  • Design of manipulation-resistance mechanisms in human computation
  • Concerns regarding the protection of labeler identities
  • Active learning from imperfect human labelers
  • Techniques for inferring expertise and routing tasks
  • Theoretical limitations of human computation


Format

The workshop will consist of several invited talks from prominent researchers in different areas related to human computation, selected presentations of technical and position papers, as well as poster and demo sessions, organized by theme.

Submission

Technical papers and position papers may be up to 6 pages in length and should follow AAAI formatting guidelines. For demos and poster presentations, authors should submit a short paper or extended abstract (up to 2 pages). We welcome early work, and we particularly encourage submission of visionary position papers that are more forward-looking. Papers must be submitted electronically via CMT. The submission deadline is April 22, 2011.

Workshop Website

For more details, please consult our workshop website.

Organizers

Luis von Ahn (co-chair)
Panagiotis Ipeirotis (co-chair)
Edith Law
Haoqi Zhang
Jing Wang

Program Committee

Foster Provost
Winter Mason
Eric Horvitz
Ed Chi
Serge Belongie
Paul Bennett
Jennifer Wortman
Yiling Chen
Kristen Grauman
Raman Chandrasekar
Rob Miller
Deepak Ganesan
Chris Callison-Burch
Vitor R. Carvalho
David Parkes

 