Policy questions

From PublicWiki
Revision as of 05:01, 30 May 2006 by Yanokwa (talk | contribs)

Is search a public service to a public good (web)?
  -> should it be regulated as such?
  -> some moves for nationalized search engines
  -> analogy: opening the search index to others, as local phone companies open their lines to long-distance companies
Cross-national position of search engines
  -> exporting censorship
  -> accountability of search companies across borders? (kind of talked about already, with respect to China and server location)
  -> indexing content & fair use
  -> primary and secondary liability for copyright infringement
Aggregation of Information and Privacy (the big one)
   -> laws against selling data of particular formats?
   -> making data/privacy policies clear
                 - tied to web browser?
                 - http://www.w3.org/P3P/
   -> how should data be scrubbed?
            - transparency of scrubbing procedure
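One way to make the scrubbing question concrete is to write the procedure down as code that anyone can read. The sketch below is only an illustration, not any search engine's actual practice: the record fields (`user_id`, `ip`, `query`, `cookie`), the key, and the choice of steps (keyed hashing of the user ID, zeroing the last octet of an IPv4 address, dropping every other field) are all assumptions made for the example.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical secret key, made up for this example. Whoever holds it can
# re-link pseudonyms to users, so how it is stored and rotated is itself
# part of the policy question.
SECRET_KEY = b"example-key-rotate-regularly"

def pseudonymize(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a user identifier with a keyed hash (a stable pseudonym)."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def truncate_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address (assumes IPv4 input)."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

def scrub(record: dict) -> dict:
    """Keep only an allow-list of fields, each in reduced form."""
    return {
        "user": pseudonymize(record["user_id"]),
        "ip": truncate_ip(record["ip"]),
        "query": record["query"],  # the query text itself is left untouched
    }

# Hypothetical log record.
raw = {
    "user_id": "alice@example.com",
    "ip": "192.0.2.77",
    "query": "python",
    "cookie": "abc123",
}
clean = scrub(raw)
print(clean)
```

Even in this sketch the query text survives unchanged, and a long enough history of queries may identify a person on its own, which is one reason the transparency of the procedure matters.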

   According to Chris Hoofnagle (from http://islandia.law.yale.edu/isp/search_papers/hoofnagle.pdf):

   "A privacy dialogue concerning search histories would include:
   • A discussion of identification: do search engines need to identify users? Can they
   do so pseudonymously? In personalized search situations, are there ways to
   divorce identification data from historical search results (that is, by applying a
   profile to the user rather than use that individual's specific search history for
   • A discussion of data retention: how long is too long?
   • A discussion on government access: what barriers (legal and technological) are
   there to law enforcement access to personal information collected by search
   engines? What will prevent a search engine from making law enforcement
   requests for data too easy (as AOL did by establishing data retention policies
   favorable to law enforcement, and by waiving service obligations to speed
   government access to data)?
   • A discussion on commercialization of data: what limits are there on the resale of
   search engine history information?"

Themes to keep in mind during discussion:
   --transparency of search algorithms
   --readability of privacy/data policies

Transparency
Google collects a lot of data on each person: the email you get (school information, prescriptions, things you buy), your address book, your notebook, your calendar, your searching habits, where you go, your location, etc. By aggregating this with what your friends do, Google can build a very good idea of who you are, and it is safe to assume that Google does not throw away any such information.

The problem then becomes the current trend of personalizing search. Google is a platform on which custom services are built, so as search becomes more personalized, so will these custom services.

Even a single search for "Python" is incredibly revealing once I click on links relating to either computer science or snakes. This small bit of information, if made accessible (even accidentally), can be used by 3rd parties to gather information about users. While it's fine for Google to keep their search algorithms closed, perhaps there should be transparency about the interfaces they present to 3rd parties.
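To show how little data the "Python" example needs, here is a minimal sketch of the inference a 3rd party could make from click data. Everything in it is hypothetical: the domain names, the domain-to-topic table, and the `infer_interest` function are made up for illustration and do not describe any real interface.

```python
from collections import Counter

# Hypothetical table mapping clicked domains to topics.
TOPIC_OF_DOMAIN = {
    "docs.python.example": "computer science",
    "tutorials.code.example": "computer science",
    "reptiles.example": "snakes",
}

def infer_interest(clicked_domains):
    """Guess a user's interest from the results they clicked for one query."""
    counts = Counter(
        TOPIC_OF_DOMAIN[domain]
        for domain in clicked_domains
        if domain in TOPIC_OF_DOMAIN
    )
    if not counts:
        return None  # no recognizable clicks, nothing to infer
    return counts.most_common(1)[0][0]

# Clicks recorded after a search for "Python" (made-up data).
clicks = ["docs.python.example", "tutorials.code.example", "reptiles.example"]
print(infer_interest(clicks))
```

Two or three clicks are enough to separate the programmer from the snake enthusiast, which is why it matters what an interface exposes to 3rd parties.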

Question: Is disclosure through transparency something that should be required of companies that mine this much information?