The wise say that no matter how busy you are, you should find time to read a good book to avoid self-inflicted ignorance. As software developers, we know this to be true, but we never seem to have enough time. This is why we turn to audiobooks, which, as we all know, are rarely cheap or free.
In this tutorial, we will learn how to build an audiobook converter that turns PDF files into their audiobook equivalents using Python libraries.
To build our audiobook converter, we will use the following Python libraries:
- pyttsx3
- PyPDF3
- tkinter
tkinter will be used to create a dialog window through which we will select our desired PDF file.
We will look deeper into the other libraries as we continue.
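As a rough illustration of the dialog step, the sketch below wraps tkinter's `askopenfilename` in a small helper. The helper name `choose_pdf` and its `ask` parameter are hypothetical, introduced here only so the function can be tried without opening a real window. The PDF-only filter is an optional refinement, not something the tutorial's script requires.

```python
from typing import Callable, Optional


def choose_pdf(ask: Optional[Callable[..., str]] = None) -> Optional[str]:
    """Ask the user for a PDF file and return its path, or None if cancelled.

    `ask` is a hypothetical hook for illustration; by default the helper
    falls back to tkinter's askopenfilename.
    """
    if ask is None:
        # Imported lazily so the helper can be exercised without a display.
        from tkinter import Tk
        from tkinter.filedialog import askopenfilename

        Tk().withdraw()  # hide the empty root window
        ask = askopenfilename

    # Only offer PDF files in the dialog.
    path = ask(title="Select a PDF file", filetypes=[("PDF files", "*.pdf")])
    # The dialog returns an empty value when the user cancels.
    return path or None
```

Calling `choose_pdf()` with no arguments opens the dialog. Returning `None` on cancel lets the caller stop early instead of handing an empty path to the PDF reader.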
To understand this article, a reader needs to have:
- A good understanding of Python.
- Python 3 installed on the computer.
- The Python text-to-speech library version 3 (pyttsx3) installed.
- The Python PDF library version 3 (PyPDF3) installed.
- A good understanding of
What is PyPDF3?
According to PyPDF3’s official documentation, it is a pure-Python library built as a PDF toolkit. It was built to help with the following:
- Extracting document information (title, author, etc.).
- Splitting documents page by page.
- Merging documents page by page.
- Cropping pages.
- Merging multiple pages into a single page.
- Encrypting and decrypting PDF files.
Note that because this library works with files page by page, it has problems manipulating PDF files whose pages are not numbered correctly.
It can be used as a tool for websites that manage or manipulate PDFs.
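To make the first item in the list above more concrete, here is a minimal sketch that reads a document's title, author, and page count. It assumes PyPDF3 exposes the same `getDocumentInfo()` method and `numPages` attribute as the PyPDF2 reader it was forked from. The helper name `describe_pdf` is hypothetical.

```python
def describe_pdf(reader) -> dict:
    """Summarise a PDF, given a PyPDF3.PdfFileReader-style object.

    Assumes the reader offers getDocumentInfo() and numPages.
    """
    info = reader.getDocumentInfo()  # may be None when the PDF has no metadata
    return {
        "title": info.title if info else None,
        "author": info.author if info else None,
        "pages": reader.numPages,
    }
```

With PyPDF3 installed, it could be called as `describe_pdf(PyPDF3.PdfFileReader("book.pdf"))`, where `book.pdf` is a placeholder file name.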
What is Pyttsx3?
According to Pyttsx3’s official documentation, Python Text to Speech version 3 (pyttsx3) is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3.
It can be applied in desktop and mobile applications and websites to convert text to speech for the visually impaired.
It can be used to create an audiobook.
We’ll install the required packages for this tutorial in a virtual environment. A virtual environment helps with our project management. For more information on virtual environments, look into this article.
We will use pipenv to create our virtual environment.
Install pipenv if you don’t have it on your machine by running the following command:

```bash
pip install pipenv
```
Then, to create the virtual environment and install the required packages, run the following commands:

```bash
pipenv install pyttsx3
pipenv install PyPDF3
```
We then activate the environment by running `pipenv shell`.
After this, we are ready to write our code as you will see below.
Writing our code
While inside our virtual environment, let’s create a file and name it main.py. Afterward, write the following code into it:
```python
# We first import the libraries we just installed
import pyttsx3
import PyPDF3
from tkinter import Tk  # tkinter comes pre-installed with Python
from tkinter.filedialog import askopenfilename

Tk().withdraw()  # prevents the root window from appearing

# Open the dialog window and read the path of the PDF file from the user
file = askopenfilename()
pdfreader = PyPDF3.PdfFileReader(file)

# Read the number of pages in the PDF file
pages = pdfreader.numPages

# Create the speech engine once, before reading any pages
audio = pyttsx3.init()

# Collect the text from every page of the PDF file
book = []
for no in range(pages):
    page = pdfreader.getPage(no)
    book.append(page.extractText())
text = "\n".join(book)

# Read the whole book aloud
audio.say(text)

# Save the audio in an mp3 file.
# Make sure to call `save_to_file` after `say` so the audio of your book is
# recorded. Saving the joined text once keeps every page in the file, rather
# than overwriting it page by page. The actual audio encoding may depend on
# your platform's speech driver.
audio.save_to_file(text, 'myaudiobook.mp3')
audio.runAndWait()
```
After running the code above, you’ll see a dialog window pop up.
Select the PDF file of your choice and enjoy your book as your machine reads it to you.
Feel free to have a preview of the results on my replit.
Note: Not all PDF files will be read through and recorded. Try PDF files with unnumbered pages for better results.
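One likely reason a PDF is not read aloud is that some pages contain no extractable text, for example scanned pages stored as images. The sketch below is a hedged way to check for that before converting. The helper name `pages_without_text` is hypothetical, and it relies only on the `numPages`, `getPage`, and `extractText` calls already used in main.py.

```python
def pages_without_text(reader) -> list:
    """Return the 0-based indexes of pages whose extracted text is empty."""
    empty = []
    for no in range(reader.numPages):
        text = reader.getPage(no).extractText()
        # Treat missing or whitespace-only text as "nothing to read aloud".
        if not text or not text.strip():
            empty.append(no)
    return empty
```

If the returned list covers every page, the speech engine has nothing to say, and the PDF would need text recognition before this converter can use it.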
You can explore the Python text-to-speech library further to change the voice, the rate of speech, and the volume. For more information on the PyPDF3 library, you can read this documentation.
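As a starting point for that exploration, the sketch below adjusts the rate, volume, and voice through pyttsx3's `getProperty` and `setProperty` methods. The helper name `configure_voice` and its parameters are hypothetical. Which voices are available depends on the speech driver installed on your machine.

```python
def configure_voice(engine, rate=None, volume=None, voice_index=None):
    """Apply optional rate, volume and voice settings to a pyttsx3 engine."""
    if rate is not None:
        engine.setProperty("rate", rate)  # speaking rate in words per minute
    if volume is not None:
        # pyttsx3 expects a volume between 0.0 and 1.0, so clamp the value.
        engine.setProperty("volume", max(0.0, min(1.0, volume)))
    if voice_index is not None:
        voices = engine.getProperty("voices")  # voices installed on this machine
        engine.setProperty("voice", voices[voice_index].id)
    return engine
```

In main.py, this could be called right after `audio = pyttsx3.init()`, for example `configure_voice(audio, rate=150, volume=0.8)`. The values are illustrative only.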
It’s time to work through the books stacked up on your “to-read list” with your very own audiobook converter.
Peer Review Contributions by: Willies Ogola
About the author
Adhinga Fredrick
Fredrick is a Computer Technology student with interests in Python for Web development, Machine Learning, and Data Science. When away from the computer, he loves nature walks and learning new things.