Python Project to Convert PDF File Text to AudioBook & Speech to PDF

Have you ever wished you could listen to a book, or speak your text rather than type it? We are here with a project that converts the text in a PDF into speech and speech into a PDF. This project not only saves typing effort but also lets you enjoy listening to your favorite books. So, let’s get to know more about this Python Project to Convert PDF Text to Audio Speech and Vice Versa.

About PDF Text to Speech and Speech to PDF Project

In this project, the user gets two options, to convert PDF text to speech and vice versa. For the first case, the user enters the start and end page, then selects the PDF to listen to. And for the latter case, the user enters the path where the PDF needs to be saved and chooses the mp3 or wav file to convert to PDF.

PDF to Speech and Speech to PDF Project in Python

To build this project, we use the Tkinter, Path, PyDub, PyPDF4, Pyttsx3 and SpeechRecognition libraries in Python. We will build the GUI using Tkinter and handle file paths using Path. To convert a PDF to audio, we read its text using PyPDF4 and speak it using Pyttsx3. To convert an audio file to PDF, we convert mp3 files to wav using PyDub, transcribe the speech using SpeechRecognition and write the PDF using PyPDF4.

Project Prerequisites

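You should have prior knowledge of Python and a basic idea of the Tkinter module. Tkinter ships with the standard Python installer, so it is not installed with pip (on some Linux distributions it comes as a separate system package). PyDub additionally needs FFmpeg installed on the system to read mp3 files. The other modules can be installed using the following commands.
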
pip install path
pip install pyttsx3
pip install pydub
pip install PyPDF4
pip install SpeechRecognition
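
Once the installation finishes, a short script can confirm that every module is importable. The following is a minimal sketch, not part of the project itself: the helper name missing_modules is hypothetical, and the list uses import names, which differ from some of the pip package names (for example, SpeechRecognition is imported as speech_recognition).

```python
import importlib.util

# Import names used by this project (not always the same as the pip names)
REQUIRED = ["tkinter", "path", "pyttsx3", "pydub", "PyPDF4", "speech_recognition"]


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Running the script before starting the project shows which of the modules still need to be installed.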

Download Python Project to Convert PDF Text to Audio Speech here

Please download the PDF to Speech and Speech to PDF Project in Python using the link: Python Text to Audio Speech & Vice versa

Project File Structure

Steps to build the Python Project to Convert PDF Text to Audio Speech are:

Importing the required modules
Writing the function to read the text from the selected PDF
Creating the GUI for the conversion of PDF to audio form
Writing function to write the audio into PDF
Creating the GUI for the conversion of audio to PDF
Creating the main window

1. Importing the required modules

The first step is to import all the modules.

#Importing modules for Python Project to Convert PDF Text to Audio Speech
import os
import tkinter
from tkinter import (Button, Entry, IntVar, Label, StringVar, LEFT,
                     filedialog, messagebox)

from path import Path
from PyPDF4 import PdfFileReader, PdfFileWriter
import pyttsx3
from speech_recognition import Recognizer, AudioFile
from pydub import AudioSegment

Code explanation:

a. Tkinter: This module helps in creating the GUI
i. filedialog: This helps in getting the path of the file selected by the user
b. Path: This module helps in handling file paths
c. PyPDF4: This module helps in reading and writing PDF files
d. Pyttsx3: This module helps in converting text to speech
e. Speech_recognition: This module helps in converting the speech in an audio file to text
f. Pydub: This module helps in converting an mp3 file to a wav file
g. Os: This module helps in interacting with the operating system

2. Writing the function to read the text from the selected PDF

#Tkinter variables holding the page range; pdf_to_audio() assigns them
start_pgNo = None
end_pgNo = None

#Function to open the PDF selected and read text from it
def read():
    path = filedialog.askopenfilename() #Get the path of the PDF based on the user's location selection
    if not path: #The user cancelled the dialog
        return
    speaker = pyttsx3.init() #Initiating a speaker object

    start = start_pgNo.get() #Getting the starting page number input (1-based)
    end = end_pgNo.get() #Getting the ending page number input (1-based, inclusive)

    with open(path, 'rb') as pdfLoc: #Opening the PDF in binary mode
        pdfreader = PdfFileReader(pdfLoc) #Creating a PDF reader object for the opened PDF
        #getPage() counts from zero, so page n of the PDF is index n - 1
        for i in range(start - 1, end):
            page = pdfreader.getPage(i) #Getting the page
            txt = page.extractText() #Extracting the text
            speaker.say(txt) #Queueing the text to be spoken
    speaker.runAndWait() #Speaking everything that was queued

Code explanation:

a. In the read() function, we first get the path of the PDF selected by the user and open the file.
b. Then we create the PdfFileReader object and initialise the speaker object.
c. Finally, we get the starting and ending page numbers, extract the text of each page in that range and pass it to the speaker to be read aloud.
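
PDF libraries count pages from zero, while readers usually count from one, and a user may also enter a range that runs past the end of the document. The following is a minimal sketch of a helper that handles both, assuming the user enters 1-based page numbers; the name page_indices is hypothetical and is not part of PyPDF4.

```python
def page_indices(start, end, num_pages):
    """Convert a 1-based inclusive page range into zero-based page indices.

    The range is clamped to the pages that exist. The result is empty when
    no page of the requested range lies inside the document.
    """
    first = max(start, 1)        # Pages before page 1 do not exist
    last = min(end, num_pages)   # Pages after the last page do not exist
    return range(first - 1, last)


if __name__ == "__main__":
    # Pages 8 to 20 of a 10-page document
    print(list(page_indices(8, 20, 10)))
```

Inside read(), the loop could then iterate over page_indices(start, end, pdfreader.getNumPages()) instead of building the range by hand.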

3. Creating the GUI for the conversion of PDF to audio form

#Function to create a GUI and get required inputs for PDF to Audio Conversion
def pdf_to_audio():
    global start_pgNo, end_pgNo #Shared with read(), which runs on the button click

    #Creating a window on top of the main window
    wn1 = tkinter.Toplevel()
    wn1.title("PythonGeeks PDF to Audio converter")
    wn1.geometry('500x400')
    wn1.config(bg='snow3')

    start_pgNo = IntVar(wn1) #Variable to hold the starting page number
    end_pgNo = IntVar(wn1) #Variable to hold the ending page number

    Label(wn1, text='PythonGeeks PDF to Audio converter',
          fg='black', font=('Courier', 15)).place(x=60, y=10)

    Label(wn1, text='Enter the start and the end page to read. To read a single \npage, enter the same page number in both fields:',
          anchor="e", justify=LEFT).place(x=20, y=90)

    #Getting the input of starting page number to be spoken
    Label(wn1, text='Start Page No.:').place(x=100, y=140)
    startPg = Entry(wn1, width=20, textvariable=start_pgNo)
    startPg.place(x=100, y=170)

    #Getting the input of ending page number to be spoken
    Label(wn1, text='End Page No.:').place(x=250, y=140)
    endPg = Entry(wn1, width=20, textvariable=end_pgNo)
    endPg.place(x=250, y=170)

    #Button to select the PDF and get the audio output
    Label(wn1, text='Click the below button to Choose the pdf and start to Listen:').place(x=100, y=230)
    Button(wn1, text="Click Me", bg='ivory3', font=('Courier', 13),
           command=read).place(x=230, y=260)
    #The main window's mainloop() also serves this window

Code explanation:

a. In the pdf_to_audio() function, we create a GUI to take the input of starting and ending page numbers
b. Then, on clicking the button the PDF is selected from the device and the text in that PDF is converted to audio form
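
An entry field accepts any text, and reading an IntVar whose field holds something that is not a number raises an error. The following is a minimal sketch of validating the two fields as plain strings before any conversion starts; the name parse_page_range is hypothetical, and the sketch assumes 1-based page numbers.

```python
def parse_page_range(start_text, end_text):
    """Return (start, end) as integers or raise ValueError with a readable message."""
    try:
        start = int(start_text.strip())
        end = int(end_text.strip())
    except ValueError:
        raise ValueError("Page numbers must be whole numbers.") from None
    if start < 1:
        raise ValueError("The start page must be 1 or greater.")
    if end < start:
        raise ValueError("The end page must not be before the start page.")
    return start, end


if __name__ == "__main__":
    print(parse_page_range(" 2", "5 "))
    try:
        parse_page_range("four", "6")
    except ValueError as error:
        print("Rejected:", error)
```

With StringVar fields instead of IntVar, the button callback could call this helper and show the message of a ValueError in a messagebox rather than letting the exception reach Tkinter.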

4. Writing function to write the audio into PDF

#Tkinter variable holding the output PDF path; audio_to_pdf() assigns it
pdfPath = None

#Function to create the PDF file and store the text, both given as parameters
def write_text(filename, text):
    writer = PdfFileWriter() #Creating a PDF writer object
    writer.addBlankPage(72, 72) #Creating a blank page (width and height in points)
    pdf_path = Path(filename) #Getting the path of the PDF
    with pdf_path.open('wb') as output_file: #Opening the PDF for binary writing
        writer.write(output_file) #Saving the blank-page PDF
        #PyPDF4 cannot draw text on a page, so the text is appended as raw
        #bytes after the PDF data; a PDF viewer will not display it
        output_file.write(text.encode('utf-8'))

#Function to convert audio into text
def convert():
    path = filedialog.askopenfilename() #Getting the location of the audio file
    if not path: #The user cancelled the dialog
        return

    pdf_loc = pdfPath.get() #Getting the path of the PDF

    #Getting the name and extension of the audio file and checking if it is mp3 or wav
    audioFileName, audioFileExt = os.path.splitext(os.path.basename(path))
    audioFileExt = audioFileExt.lower()
    if audioFileExt not in ('.wav', '.mp3'):
        messagebox.showerror('Error!', 'The format of the audio file should be either "wav" or "mp3".')
        return

    #If it is an mp3 file, converting it into a wav file first
    if audioFileExt == '.mp3':
        audio_file = AudioSegment.from_file(path, format='mp3')
        source_file = f'{audioFileName}.wav'
        audio_file.export(source_file, format='wav')
    else:
        source_file = path

    #Creating a recognizer object and converting the audio into text
    recog = Recognizer()
    with AudioFile(source_file) as source:
        speech = recog.record(source) #Reading the whole audio file
    text = recog.recognize_google(speech) #Needs an internet connection
    write_text(pdf_loc, text)

Code explanation:

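a. In the write_text() function, we first create a PdfFileWriter object, add a blank page and save the PDF to the given path. The text is then written to the same file. Note that PyPDF4 has no function for drawing text on a page, so text written this way is stored in the file but is not displayed on the page by a PDF viewer.
b. In the convert() function, we get the paths of the audio file and the PDF, convert an mp3 file to wav if needed, and convert the audio into text using the Recognizer object. Then we call the write_text() function.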
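
To make the recognized text visible on the page, the PDF has to contain a page content stream that draws it. The following is a minimal sketch of one way to do that with only the standard library. It is a different technique from the PyPDF4 writer used in this project: it writes a small PDF file by hand. The name build_text_pdf is hypothetical, and the sketch assumes a single US Letter page, the built-in Helvetica font and Latin-1 text. Other characters are replaced by "?", and text beyond roughly 46 lines runs off the bottom of the page.

```python
import textwrap


def _escape(line):
    """Escape the characters that are special inside a PDF string literal."""
    return line.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_text_pdf(text, width=612, height=792):
    """Return the bytes of a one-page PDF that shows `text` in 11 pt Helvetica."""
    lines = textwrap.wrap(text, 80) or [""]
    # Content stream: start one inch from the top-left corner, one line per Tj
    ops = ["BT", "/F1 11 Tf", "14 TL", f"72 {height - 72} Td"]
    ops += [f"({_escape(line)}) Tj T*" for line in lines]
    ops.append("ET")
    stream = "\n".join(ops).encode("latin-1", "replace")

    page = (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
            "/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        page.encode("ascii"),
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # Byte offset of this object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical output file name, used only for this demonstration
    with open("transcript_demo.pdf", "wb") as output_file:
        output_file.write(build_text_pdf("Hello from the audio to PDF converter."))
```

A function like this could take the place of the body of write_text(), writing the returned bytes to the path entered by the user. For longer transcripts or non-Latin text, a dedicated PDF generation library would be the more practical choice.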

5. Creating the GUI for the conversion of audio to PDF

#Function to create a GUI and get required inputs for Audio to PDF Conversion
def audio_to_pdf():
    global pdfPath #Shared with convert(), which runs on the button click

    #Creating a window on top of the main window
    wn2 = tkinter.Toplevel()
    wn2.title("PythonGeeks Audio to PDF converter")
    wn2.geometry('500x400')
    wn2.config(bg='snow3')

    pdfPath = StringVar(wn2) #Variable to get the PDF path input

    Label(wn2, text='PythonGeeks Audio to PDF converter',
          fg='black', font=('Courier', 15)).place(x=60, y=10)

    #Getting the PDF path input
    Label(wn2, text='Enter the PDF File location where you want to save (with extension):').place(x=20, y=50)
    Entry(wn2, width=50, textvariable=pdfPath).place(x=20, y=90)

    Label(wn2, text='Choose the Audio File location that you want to read (.wav or .mp3 extensions only):',
          fg='black').place(x=20, y=130)

    #Button to select the audio file and do the conversion
    Button(wn2, text='Choose', bg='ivory3', font=('Courier', 13),
           command=convert).place(x=50, y=170)
    #The main window's mainloop() also serves this window

Code explanation:

a. In the function, we create a GUI to get the PDF path input.
b. Then on clicking the button, the audio file is selected for conversion purpose
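
The file that the user selects may have an upper-case extension or several dots in its name, so splitting the name on the first dot is fragile. The following is a minimal sketch of a more tolerant check based on os.path.splitext; the name split_audio_path and the SUPPORTED set are hypothetical.

```python
import os

# Extensions that the converter accepts, in lower case
SUPPORTED = {".wav", ".mp3"}


def split_audio_path(path):
    """Return (stem, extension) for a supported audio file, else raise ValueError."""
    stem, ext = os.path.splitext(os.path.basename(path))
    ext = ext.lower()  # Treat ".MP3" and ".mp3" the same way
    if ext not in SUPPORTED:
        raise ValueError(f"Unsupported audio format: {ext or 'none'}")
    return stem, ext


if __name__ == "__main__":
    print(split_audio_path("/home/user/my.first.talk.MP3"))
```

The returned stem can be used to name the temporary wav file, and the extension tells whether an mp3 to wav conversion is needed at all.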

6. Creating the main window

#Creating the main window
wn = tkinter.Tk()
wn.title("PythonGeeks PDF to Audio and Audio to PDF converter")
wn.geometry('700x300')
wn.config(bg='LightBlue1')

Label(wn, text='PythonGeeks PDF to Audio and Audio to PDF converter',
      fg='black', font=('Courier', 15)).place(x=40, y=10)

#Button to convert PDF to Audio form
Button(wn, text="Convert PDF to Audio", bg='ivory3', font=('Courier', 15),
       command=pdf_to_audio).place(x=230, y=80)

#Button to convert Audio to PDF form
Button(wn, text="Convert Audio to PDF", bg='ivory3', font=('Courier', 15),
       command=audio_to_pdf).place(x=230, y=150)

#Runs the window till it is closed
wn.mainloop()

Code explanation:

a. In the above code, we create the main window with two buttons
b. One for PDF to Audio conversion by calling the pdf_to_audio() function
c. And the other for Audio to PDF conversion by calling the audio_to_pdf() function

Python PDF Text to Audio Speech and Speech to PDF Output

The image of the python PDF text to Audio speech converter GUI

Summary

Hurray! We have successfully built the PDF to audio and audio to PDF converter. In this project, we were introduced to modules such as Tkinter, PyPDF4, Pyttsx3, PyDub and SpeechRecognition. Hope you enjoyed building this project with us!
