AnkiDroid audio with Python and Google

I've been learning Czech recently. My Anki cards have German, my native tongue, on the front and Czech on the back. When listening to Czech audio, though, it took me a long time to translate even simple words or to grapple with the conjugation of Czech verbs. The remedy seems easy: create a deck with Czech audio and text on the front and German on the back of the card.

Anki is an open-source product, and it's not too hard to dig up information on how the Anki 2 database is formatted. An article by Julian Sobzak explains the contents of the .apkg format: it's a zip file with the raw audio files (usually MP3) in a folder and an SQLite3 database that contains all the text. The article also lists the database fields you need to extract the information from the cards.
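
As a rough sketch of that unpacking step, Python's standard zipfile module can list the archive members and pull out the database. The helper name extract_collection and the file name deck.apkg in the usage note are placeholders for illustration, not anything defined by Anki.

```python
import zipfile


def extract_collection(apkg_path, target_dir='.'):
    """List the members of an .apkg and extract collection.anki2.

    Returns the list of member names found in the archive.
    """
    with zipfile.ZipFile(apkg_path) as apkg:
        names = apkg.namelist()
        # The SQLite database with the note text is stored under this name.
        apkg.extract('collection.anki2', path=target_dir)
    return names
```

Calling extract_collection('deck.apkg') would leave collection.anki2 in the current directory and return the member names, which is a quick way to see what else is in the package.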

Getting the information out of the database is conceptually simple once you have unzipped the .apkg and have the SQLite database collection.anki2 in the local directory.

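import sqlite3

# Open the database that was extracted from the .apkg
conn = sqlite3.connect('collection.anki2')
c = conn.cursor()

# Each row of the notes table is one note
for row in c.execute('SELECT guid, flds FROM notes'):
    guid = hex(hash(row[0]))             # used as the MP3 file name
    deu, cze = row[1].split('\x1f')      # fields are separated by 0x1f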

The flds column contains the fields of the note (more on that later). Split flds by the character 0x1f and you have its constituent parts. Here I hash the GUID of the note to create a string that will serve as the file name.
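
To make the split concrete, here is a small self-contained sketch that runs against an in-memory database instead of a real collection. The two-column notes table and the sample row are made up for illustration (the real table has more columns). The hashlib-based file name is a swapped-in alternative to the built-in hash() used above: Python 3 salts hash() for strings per process by default, so hashlib is one way to get a name that stays the same between runs.

```python
import hashlib
import sqlite3

# Toy stand-in for collection.anki2: only the two columns used here.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE notes (guid TEXT, flds TEXT)')
conn.execute('INSERT INTO notes VALUES (?, ?)',
             ('abc123', 'Hund\x1fpes'))  # made-up sample note


def stable_name(guid):
    """Derive a file-name-safe string that is identical on every run."""
    return hashlib.sha1(guid.encode('utf-8')).hexdigest()[:16]


rows = []
for guid, flds in conn.execute('SELECT guid, flds FROM notes'):
    # 0x1f (the ASCII unit separator) delimits the fields of a note.
    deu, cze = flds.split('\x1f')
    rows.append((stable_name(guid), deu, cze))

print(rows)
```

The two-value unpacking assumes every note has exactly two fields, as in my deck; notes with more fields would need a different split.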

There is another project named genanki by Kerrick Staley over on GitHub. This Python 3 library helps you generate a new deck from the newly created data. If you're familiar with AnkiDroid, the only thing that will be new to you is the notion of a Model. A Model serves as a template; a Note is the actual content that is passed into the template when it is rendered. So we'll have a generic card that takes two parameters, the front and the back content.

import genanki

deck = genanki.Deck(2059400110, 'Tschechisch Deutsch (Talki)')
package = genanki.Package(deck)
model = genanki.Model(
    1607392319,
    'Simple Model',
    fields=[
        {'name': 'Czech'},
        {'name': 'German'},
    ],
    templates=[
        {
            'name': 'Card 1',
            'qfmt': '{{Czech}}',
            'afmt': '{{FrontSide}}<hr id="answer">{{German}}',
        },
    ])

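mediafiles = []
for row in c.execute('SELECT guid, flds FROM notes'):
    guid = hex(hash(row[0]))
    deu, cze = row[1].split('\x1f')
    # ... write '<guid>.mp3' for this note here (see the TTS step below)
    mediafiles.append('{}.mp3'.format(guid))
    my_note = genanki.Note(
        model=model,
        fields=['{}, [sound:{}.mp3]'.format(cze, guid),
                deu])
    deck.add_note(my_note)

# Without this the MP3 files are not bundled into the package
package.media_files = mediafiles
package.write_to_file('output.apkg')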
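
The two long numbers passed to genanki.Deck and genanki.Model are IDs that you pick yourself. As far as I understand the genanki README, they should be unique and kept fixed once chosen, and it suggests drawing them once with random.randrange(1 << 30, 1 << 31) and hard-coding the result. A minimal sketch of that one-off step (the function name new_anki_id is just for illustration):

```python
import random


def new_anki_id():
    """Draw a random ID in the range [2**30, 2**31)."""
    return random.randrange(1 << 30, 1 << 31)


# Run once, then paste the printed numbers into the script as constants.
print('deck id: ', new_anki_id())
print('model id:', new_anki_id())
```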

Finally, we need a text-to-speech (TTS) system to render the Czech words. Czech is not universally available from TTS providers. Notably, AWS Polly cannot do Czech at this point in time. Google GCE can do Czech output, but you can't get a GCE account without a company if you're from Europe.

Fortunately, there's a way around that: the excellent gTTS by Pierre-Nick Durette lets you dump MP3 output directly from Google Translate.

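    # Needs `from gtts import gTTS` at the top of the script.
    # Pass the language by keyword: some gTTS releases take `tld`
    # as the second positional argument.
    tts = gTTS(cze, lang='cs')
    with open('{}.mp3'.format(guid), 'wb') as f:
        tts.write_to_fp(f)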
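
Every gTTS call is a network request to Google Translate, so when re-running the script it may be worth skipping words whose MP3 already exists. Below is one possible sketch. The names ensure_audio and synthesize are hypothetical; synthesize stands for any function that writes the MP3 data for a text to an open binary file.

```python
import os


def ensure_audio(text, path, synthesize):
    """Write audio for `text` to `path` unless the file already exists.

    `synthesize(text, fileobj)` is a caller-supplied function (an assumption
    of this sketch) that writes the MP3 bytes to the open file object.
    Returns True if a new file was written, False if an existing one was kept.
    """
    # An empty file is treated as missing, e.g. after a failed earlier attempt.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return False
    with open(path, 'wb') as f:
        synthesize(text, f)
    return True
```

With gTTS this could be called as ensure_audio(cze, '{}.mp3'.format(guid), lambda text, f: gTTS(text, lang='cs').write_to_fp(f)).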

That's it. The full raw script is over at GitHub in my scratch repository.

Feel free to adapt.