Liberating voter rolls in memory of Aaron Swartz

Over 40 geeks got together for a night of understanding Aaron Swartz's work and contributing to keep his memory and projects alive.

Arun Raghavan, an open source software enthusiast, and four friends worked through the night of January 19th on an unusual problem: scraping electoral data from ceokarnataka's website. They wanted to create a user-friendly frontend for citizens to search for their names and polling booth information. They did this as part of a hacknight held on January 19th and 20th to commemorate the life and work of Aaron Swartz. The event was organised by HasGeek, an event organiser for geeks.

Aaron Swartz was a hacktivist, who died in early January. He had helped create RSS 1.0; contributed to Creative Commons; was an early builder of Reddit, where he’s often acknowledged as a co-founder; and more recently, became a data liberator, which got him into trouble with the law.

The idea behind the hacknight was that although Aaron Swartz is gone, his work on making the world a better place should not die with him. The aim was to understand his work and related issues such as IT laws, copyright rules and access to information, and to contribute to keeping Swartz’s memory and projects alive.

Pic courtesy: HasGeek

Swartz had initiated several coding projects during his lifetime. Anand Chitipothu, a Bangalore-based developer who collaborated with Swartz at the Internet Archive and maintains his framework, suggested that the hacknight could also be an opportunity for people to get familiar with Aaron’s coding projects and work on some of them.

Around 40 people participated. Some participants proposed projects to liberate different kinds of public data, such as electoral data, weather data and train timetables, and to crawl data from government and NIC websites. Developers worked on these projects to make the data searchable and usable.

Discussions during the hacknight: The hacknight started at 3 PM with a discussion about the life of Aaron Swartz and the political and legal implications of his coding projects and activism. This discussion was led by Chitipothu and Kiran Jonnalagadda of HasGeek.

Swartz had started freeing data funded by public money which constitutionally belonged in the public domain. He published data from the catalogue of the Library of Congress and the US case law archives on the Internet Archive. Later, Aaron downloaded articles from JSTOR to release academic papers whose research was funded with public money. Before he could sift through the downloads, Aaron was caught by the police. He returned the hard disk containing the downloads. JSTOR and MIT did not pursue cases against him, but the United States government charged Aaron with breaking into the MIT campus and faking his identity by changing the MAC address of his computer.

At the end of Jonnalagadda’s presentation, participants asked several questions about activism, what constitutes offensive speech, the framework of IT laws in India, and the process of law-making.

Kiran Jonnalagadda and Anand Chitipothu at the hacknight. Pic courtesy: HasGeek

Sunil Abraham of the Centre for Internet and Society (CIS) also joined the hacknight. He made a presentation about copyright laws, the Indian IT Act and Swartz’s work. After Sunil’s presentation, there was a half-hour discussion about the scope of copyright laws in India, copyright exemptions and what constitutes copyright infringement. Participants agreed that the trouble lies with the broad interpretations of copyright and IT laws, which enable the state and private parties to target and harass a person, often on frivolous grounds.

Hacknight projects

At 6 PM, participants with project ideas and those who wanted to join projects formed groups.

A complete list of the projects that participants worked on during the hacknight is available on the hacknight website. We talked with some of the teams and individual participants to understand their projects, the process they followed for solving the problems, and the outcomes at the end of the hacknight.

Liberating electoral data: Arun Raghavan, an open source enthusiast, and four other participants (Arun K, Praveen, Mikul and Sumant) worked on scraping electoral data from the ceokarnataka website. They planned to build a frontend that would make it easy for users to search for their names and polling booth information. Currently, the electoral roll is published as a PDF document for each polling station, along with a search form (which is unreliable and fails often) for individuals to find their names on the roll and the location of their polling station.

It was difficult to parse the data because the PDFs were not designed for machine readability, so the team had to spend time understanding how to extract the text. The other problem was that each person’s name was printed above the father’s name, and a very long name overlapped the father’s name. This made it difficult to determine where the person’s name ended and where the father’s name began. The team came up with a heuristic to distinguish between the two based on slight differences in the way the text was printed on each sheet.
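The article does not say which printing differences the team's heuristic relied on. As a rough illustration only, the sketch below assumes that each extracted text fragment carries a page position and that the two names sit on slightly different baselines. The `Fragment` type, the `split_names` function and the sample coordinates are all hypothetical and are not taken from the team's repositories.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """A piece of text extracted from a PDF page (hypothetical structure)."""
    text: str
    x: float  # horizontal position on the page
    y: float  # vertical position of the baseline; larger means lower


def split_names(fragments, name_y, father_y):
    """Assign each fragment to whichever assumed baseline it is closer to.

    name_y and father_y are the expected baselines of the person's name
    and the father's name; both are assumptions made for this sketch.
    """
    name_parts, father_parts = [], []
    # Read fragments left to right so the words keep their printed order.
    for frag in sorted(fragments, key=lambda f: f.x):
        if abs(frag.y - name_y) <= abs(frag.y - father_y):
            name_parts.append(frag.text)
        else:
            father_parts.append(frag.text)
    return " ".join(name_parts), " ".join(father_parts)


# Made-up fragments: a long name whose second word overlaps the line below.
sample = [
    Fragment("Venkata", 10.0, 100.0),
    Fragment("Subramanian", 60.0, 100.2),
    Fragment("Ramaswamy", 10.0, 103.9),
    Fragment("Iyer", 70.0, 104.1),
]
person, father = split_names(sample, name_y=100.0, father_y=104.0)
print(person, "|", father)
```

A real roll would probably need the baselines estimated per sheet rather than passed in, since the article notes that the differences varied with how each sheet was printed.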

At the end of the hacknight, the group had almost managed to get a dump of an entire electoral roll. The project repositories are ceoscraper and ceo-kar-roll-scraper.

Other data liberation projects:

  1. Indexing government websites by category of information: Elvis D’souza worked on crawling government websites and indexing them by category, for example education, import-export trade, and science and technology. According to him, government websites contain lots of information, including documents and spreadsheets. At the hacknight, Elvis completed the indexing process and ran some statistics on the information contained in these websites. He eventually wants to build a portal where people can access this index and the documents.

  2. Railway timetable data: Anand scraped data from the IRCTC website. Supreeth Srinivasmurthy worked with this data to plot a map. Bibhas Debnath also worked on the timetable data to build an API. A demo of this API is yet to be released.

  3. Parsing weather data: Asok Padda converted weather data from HTML format to Excel sheets. Hourly weather data for all weather stations in India during 2012 was parsed and uploaded to the Internet Archive.

Other projects: Kashyap Kondamundi started building an app which will help people calculate the current value of their mutual funds. He built 70% of this app at the hacknight.
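The weather-data conversion in item 3 above can be sketched roughly as follows. This is a minimal illustration, not the code used at the hacknight: it assumes the source pages hold the readings in a plain HTML table, and it writes CSV (which Excel can open) rather than a native Excel file so that only the Python standard library is needed. The `TableParser` class, the `html_table_to_csv` function and the sample row are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table rows found in html as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample page; the station and values are illustrative only.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>Station</th><th>Hour</th><th>Temp (C)</th></tr>"
    "<tr><td>Bangalore</td><td>09</td><td>24.5</td></tr>"
    "</table>"
)
print(html_table_to_csv(SAMPLE_HTML))
```

Real pages are likely to need extra handling, for example nested tables or merged cells, which this sketch ignores.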


