IEG For Kannada Wikipedia

About a year ago, we found that it was becoming increasingly difficult to search for Kannada books on DLI (Digital Library of India) and OUDL (Osmania University Digital Library).

Reasons:

  • The index in these digital libraries used Latin letters for book names, i.e. Kanglish. It was not easy to type a book name and find it because of the transliteration rules used to come up with these names. Ex:
    Aitihaasika Kathaaval’i Erd’aneya Bhaaga.
  • Since the names were not available in Unicode in the native language, more than 5000 books in these libraries stayed hidden from common people who would have loved to read them or use them for research, learning etc.
  • Even if we found the book, it was difficult to read, as DLI needed a special plugin to view TIF files in the browser. Luckily, OUDL had books in PDF format.

To view the books online, Windows users can download the Alternatiff plugin and Linux users the Plugger plugin (install the GTK and GLib prerequisites first).

These reasons were more than enough to push us to do something about it. We started a small project to pull the metadata from these libraries and see what we could do about the books.

We initiated a crowdsourcing project on our community portal http://samooha.sanchaya.net, pushed in the metadata that Pavithra had collected from the DLI & OUDL websites, and requested people to help us transcribe the book names, publisher names and author names into Kannada in Unicode. The following video demonstrates how crowdsourcing helped us turn the metadata into useful information quickly.

We had fun working on the above interface ourselves. It was a joy to see the donut graphs turning blue while we downloaded many books that we found interesting or had missed reading all these days. It was the same experience for others who joined us from various countries to help with this crowdsourcing activity. Thanks to Devaraj (Devu), who helped put this module live quickly.

Apart from getting the data, we also worked on getting it validated. More or less the same set of people went through the books once again to make sure that the metadata did not contain incorrect information.
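For readers who want to try something similar, a small automated check can complement this kind of manual validation. The snippet below is only a minimal sketch in Python, not the code that runs on Samooha Sanchaya. The field names (`title`, `author`, `publisher`) and the helper functions are hypothetical; the only fact it relies on is that the Kannada script occupies the Unicode block U+0C80 to U+0CFF. It flags metadata fields that are empty or still contain letters from another script, such as Latin.

```python
import unicodedata

# Unicode block assigned to the Kannada script.
KANNADA_START, KANNADA_END = 0x0C80, 0x0CFF

# Hypothetical field names for one book record.
FIELDS = ("title", "author", "publisher")


def needs_transcription(value: str) -> bool:
    """Return True if the value is empty or has letters outside the Kannada block."""
    # Keep only letters (L*) and combining marks (M*); ignore spaces,
    # digits, punctuation and format characters such as ZWJ/ZWNJ.
    script_chars = [ch for ch in value if unicodedata.category(ch)[0] in ("L", "M")]
    if not script_chars:
        return True
    return any(not (KANNADA_START <= ord(ch) <= KANNADA_END) for ch in script_chars)


def pending_fields(record: dict) -> list:
    """List the fields of a record that still need work from a volunteer."""
    return [name for name in FIELDS if needs_transcription(record.get(name, ""))]


if __name__ == "__main__":
    # Made-up sample record: the title is done, the author is still in Latin
    # letters and the publisher is missing.
    sample = {"title": "ಐತಿಹಾಸಿಕ ಕಥಾವಳಿ", "author": "Unknown Author"}
    print(pending_fields(sample))
```

A check like this cannot tell whether a Kannada name is spelt correctly, so it only narrows down what people still have to read through.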

Now it was time to let people search for these books on the Internet without much effort. At least they could find the book names, authors and publishers in Kannada. We made this possible via http://pustaka.sanchaya.net

[Image: pustaka_sanchaya]

The above image is just a snapshot. More than 5000 books from 100+ categories can be searched on this platform.

For a while we assumed the work was done, but queries kept following our tweets and blog posts on the above projects, and we discovered more issues. People desperately wanted to download some of these books, some were looking for books listed under different author names, and some books did not have the right information because the printer’s name had been used as the publisher’s name, and so on…

Apart from that, we had more technical issues that we had overlooked while working on the metadata and other things in the early stages of this project, namely the round-the-clock availability of the digital library portals. DLI & OUDL used to fail intermittently while serving pages or files. The OUDL library used to go offline after 5pm, just like government offices shutting down for the day. Writing to the people concerned never yielded any results.

I’m writing this whole story because we believe that the work of making these books accessible to all Kannadigas is NOT OVER. More work needs to be done. We thought we should have wiki pages for all these books, and request people to join us in writing more about the books, the authors, the printing presses and the publishers. Fellow Wikimedians heard about our idea and helped us write a plan for it. As a result, I ended up submitting our project plan for the Wikimedia Foundation’s Individual Engagement Grant (IEG).

Please find the detailed IEG project plan here: Growing Kannada-language Wikimedia projects with a digital library

We have been lucky to be chosen to work on this project, along with the other 13 participants who submitted excellent ideas. Read more about the other IEG projects here on Wikimedia Blog.

We are now gearing up to work on the project in full swing. This IEG requires us to teach new and would-be Wikipedians about wiki editing while working on making the Kannada books from the digital libraries searchable and available for learning, research etc.

It’s a challenge to make this happen, as it again involves crowdsourcing activities at various stages. It also involves understanding how we can manage mass editing on Wikipedia while taking care of edit quality and more.

Looking forward to lots of support and encouragement here.

Technology for Conserving Language: Presented at Kannada Sahitya Sammelana

The 81st Akhila Bharata Kannada Sahitya Sammelana (Jan 31st – 3rd Feb) is happening here in Shravana Belagola, Hasana District, Karnataka. I was fortunate to be invited to talk at the session on Information Technology for Language, chaired by Dr. Chidananda Gowda. Dr. U.B Pavanaja and Ram Prakash H from Tachyon Technologies (Quillpad) presented on Kannada Wikipedia and Kannada OCR respectively.

In spite of the unavailability of facilities such as a projector and internet connectivity, I would say we did our best to explain how technology can make a difference to the way we look at language, and how we can work together to save it from extinction.

Here are my slides, in which I have tried to explain the challenges to the survival of the Kannada language and the solutions explored, invented and developed by a few of us. The first few slides talk about the carvings found in our ancient sculptures and how the Kadamba & Chalukya fonts by Kannada computing expert Shri K.P Rao help us continue researching them. His work on the ‘Apara’ font, which helps non-Kannadigas and others who cannot read the Kannada script to read Kannada in their own language, was also a highlight.

Inaccessible sources of data, government websites and libraries, and the physical condition of many archives across the country were the inspiration to think of mirroring our literary treasure on the Internet. Kannada Sanchaya projects such as Vachana Sanchaya, Samooha Sanchaya, Pustaka Sanchaya and others were highlighted to showcase why literature should be made available on the internet using Unicode and other standards, so that researchers, students and common people can experience it and continue researching it forever.

I spoke about the support we got from FOSS (Free and Open Source Software) and Free Culture, which made the work around Kannada Sanchaya, Wikipedia etc. possible. I also recalled how FOSS helped us make digital libraries reachable and usable for the common man. Samooha Sanchaya had taken just over a week to complete its first milestone of transliterating 2252 books, and the same books can be searched at http://pustaka.sanchaya.net directly in Kannada. The ‘FUEL Project’ (FUEL – Frequently Used Entries for Localization), which helps build community consensus to standardize localization efforts for a language, was mentioned as an example of how FOSS communities can set standards and ensure that language projects get the best out of the resources.

Mobile solutions built by many Kannada IT professionals, the ability of Google Transliteration and the Gesture Search app to identify Kannada on Android, the Mozilla Firefox browser and the Firefox OS phone were also highlights of my talk, to help understand where we stand today with mobile technology, and the work that needs more attention from here on.

My talk ended with a highlight on the Open Knowledge initiatives: Wikipedia contributions, and books re-licensed under Creative Commons by the Government of Karnataka’s Cultural department, Niranjana’s works by Tejaswini Niranjana, Mysore University etc.

Kannada should never be on the endangered languages list, and it is possible for us to enrich the language for the next generation through knowledge sharing, collaboration and by motivating each other to act responsibly.

Here are my slides in Kannada. I have tried my best to summarize the entire talk in English for my other friends who can’t read and understand Kannada. I shall make my talk transcript available online shortly.

Thank you,


Conserving Linguistic Heritage the FOSS way

Presentation on http://vachana.sanchaya.net, an effort to digitize and build a linguistic research tool for Kannada. Presented at Swatantra 2014 – Fifth International Free Software Conference, Kerala, on 19th December 2014.

Event Page: http://icfoss.org/fs2014/program_details.html#Wikipedia/Wikimedia

Presentation Deck:

Arivina Alegalu on Samyukta Karnataka

‘Arivina Alegalu’, the first Kannada e-book on FOSS/Science/Technology, was released under the OpenContent project by the Sanchaya team in 2011 to celebrate the freedom of FOSS during the Indian Independence Day celebrations. The project aims to make technology reachable to the common man in simple language, through people who love language and technology. A young team of writers and technology users started contributing to this project enthusiastically, and in 2012 it saw its second release.

Being online still doesn’t fulfill our dream of reaching the common man. Hence, we have always tried to find ways to reach out beyond the internet.

Today the project reached another milestone by getting printed in the ‘Samyukta Karnataka’ Kannada news daily… (Find its online edition here – http://epaper.samyukthakarnataka.com). With this, Arivina Alegalu also becomes the first e-book under ‘Creative Commons’ CC-By-SA-NA to be printed in any Kannada news daily. We expect this project to reach millions of people, and to inspire and encourage more Kannadigas to start contributing technical, scientific and technological writing to literature in one way or another.

Kudos to the entire team and contributors on this occasion.