Converting documents with Python

Below we will see how to convert documents quickly and easily with Python.

1. LibreOffice/UNO (unoconv) - The most complete solution

# Installation on Ubuntu/Debian
sudo apt update
sudo apt install libreoffice python3-uno

# Installation on macOS (with Homebrew)
brew install libreoffice

# Installation on Windows:
# Download LibreOffice from https://www.libreoffice.org/download/download/

# Install unoconv, a command-line wrapper around LibreOffice's UNO bindings
pip install unoconv

Example usage:

import subprocess

# Convert DOCX to PDF (check=True raises an error if unoconv fails)
subprocess.run(['unoconv', '-f', 'pdf', 'documento.docx'], check=True)

# Convert ODT to DOCX
subprocess.run(['unoconv', '-f', 'docx', 'documento.odt'], check=True)
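The two calls above can be wrapped in a small helper. What follows is only a sketch: the function names `build_unoconv_command` and `convert_with_unoconv` are made up for illustration, and the returned output path assumes unoconv's default behaviour of writing the result next to the input file.

```python
import shutil
import subprocess
from pathlib import Path


def build_unoconv_command(input_path, target_format):
    """Return the argument list for converting input_path with unoconv."""
    return ["unoconv", "-f", target_format, str(input_path)]


def convert_with_unoconv(input_path, target_format="pdf"):
    """Run unoconv and return the path where the output is expected."""
    # Fail early with a clear message if the command-line tool is missing
    if shutil.which("unoconv") is None:
        raise RuntimeError("unoconv was not found on PATH")
    # check=True raises CalledProcessError if unoconv exits with an error
    subprocess.run(build_unoconv_command(input_path, target_format), check=True)
    # Assumption: unoconv writes next to the input, swapping the extension
    return Path(input_path).with_suffix("." + target_format)
```

Calling `convert_with_unoconv('documento.docx')` would then be expected to produce `documento.pdf` in the same folder.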

2. Pandoc - For simple documents (Markdown, LaTeX, etc.)

# Installation
sudo apt install pandoc  # Linux
brew install pandoc     # macOS
# Windows: https://pandoc.org/installing.html

pip install pypandoc

Example usage:

import pypandoc

# Convert Markdown to DOCX
pypandoc.convert_file('input.md', 'docx', outputfile='output.docx')

# Convert LaTeX to PDF (Pandoc needs a PDF engine such as pdflatex for this)
pypandoc.convert_file('input.tex', 'pdf', outputfile='output.pdf')
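The same call also works in a loop when a whole folder has to be converted. The sketch below assumes a folder of `.md` files; `target_path` and `convert_folder` are hypothetical helper names, not part of pypandoc.

```python
from pathlib import Path


def target_path(source, extension, output_dir):
    """Map e.g. notes/intro.md to <output_dir>/intro.docx."""
    return Path(output_dir) / (Path(source).stem + "." + extension)


def convert_folder(input_dir, output_dir, extension="docx"):
    """Convert every .md file in input_dir and return the written paths."""
    import pypandoc  # imported lazily so target_path works without it

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(Path(input_dir).glob("*.md")):
        destination = target_path(source, extension, output_dir)
        pypandoc.convert_file(str(source), extension, outputfile=str(destination))
        written.append(destination)
    return written
```

Keeping the path logic in its own function makes it easy to check where files will end up before running any conversion.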

3. Pure Python libraries (no external dependencies)

a) For DOCX/ODT:

pip install python-docx odfpy

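Example: converting ODT → DOCX:

from docx import Document
from odf import teletype, text
from odf.opendocument import load  # odfpy is imported as "odf"

def odt_to_docx(input_path, output_path):
    # Copies plain paragraph text only: styles, tables and images are not carried over
    doc = Document()
    odt = load(input_path)

    # text.P is the ODF paragraph element; extractText returns its text content
    for para in odt.getElementsByType(text.P):
        doc.add_paragraph(teletype.extractText(para))

    doc.save(output_path)

odt_to_docx('input.odt', 'output.docx')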

b) For PDF:

pip install pdf2docx 

Example PDF → DOCX:

from pdf2docx import Converter

cv = Converter('input.pdf')
cv.convert('output.docx', start=0, end=None)
cv.close()
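If the conversion raises an exception, the snippet above never reaches `cv.close()`. A possible variant, shown only as a sketch, wraps the same three calls in `try`/`finally`; the function name `pdf_to_docx` and its parameter names are illustrative.

```python
def pdf_to_docx(pdf_path, docx_path, first_page=0, last_page=None):
    """Convert a PDF to DOCX, closing the converter even on errors."""
    from pdf2docx import Converter  # imported lazily; requires pdf2docx

    cv = Converter(pdf_path)
    try:
        # Page indexes are zero-based; last_page=None means "up to the end"
        cv.convert(docx_path, start=first_page, end=last_page)
    finally:
        cv.close()
```

Usage would look like `pdf_to_docx('input.pdf', 'output.docx')`.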

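4. Format-specific alternatives

a) For spreadsheets (CSV/XLSX/ODS):

# odfpy provides the 'odf' engine that pandas uses in the example below
pip install pandas odfpy pyexcel pyexcel-xlsx pyexcel-ods

Example CSV → ODS:

import pandas as pd
df = pd.read_csv('input.csv')
# index=False keeps the DataFrame index out of the spreadsheet
df.to_excel('output.ods', engine='odf', index=False)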

b) For presentations (PPTX → PDF):

pip install python-pptx

Example PPTX → PDF (requires unoconv; python-pptx can read and edit presentations but cannot export them to PDF):

import subprocess
subprocess.run(['unoconv', '-f', 'pdf', 'presentazione.pptx'])

5. Docker for isolated environments

If you want to avoid system-wide installations:

docker run -v $(pwd):/convert -it docker.io/libreoffice/headless unoconv -f pdf /convert/documento.docx
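The same command can be assembled from Python, which avoids shell quoting problems when paths contain spaces. This is a minimal sketch: `build_docker_command` is a hypothetical helper, the image name is simply copied from the command above and not verified here, and `--rm` is used instead of `-it` so the container is removed once the conversion finishes.

```python
import subprocess
from pathlib import Path

# Image name taken from the command above; swap in whichever image you use
IMAGE = "docker.io/libreoffice/headless"


def build_docker_command(input_file, target_format="pdf", image=IMAGE):
    """Return the docker argument list that converts input_file."""
    source = Path(input_file).resolve()
    return [
        "docker", "run", "--rm",
        # Mount the folder containing the document at /convert
        "-v", f"{source.parent}:/convert",
        image,
        "unoconv", "-f", target_format, f"/convert/{source.name}",
    ]


def convert_in_docker(input_file, target_format="pdf"):
    """Run the conversion; raises CalledProcessError on failure."""
    subprocess.run(build_docker_command(input_file, target_format), check=True)
```

Passing the arguments as a list means no shell is involved, so `$(pwd)` is replaced by the resolved parent folder of the input file.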

Summary table of alternatives:

| Format | Recommended library | Installation command |
| --- | --- | --- |
| DOCX ↔ ODT | odfpy + python-docx | pip install odfpy python-docx |
| PDF → DOCX | pdf2docx | pip install pdf2docx |
| XLSX ↔ ODS | pandas | pip install pandas odfpy openpyxl |
| Presentations | unoconv (via Docker) | docker pull libreoffice/headless |

Final tips:

  1. For maximum compatibility: use LibreOffice/UNO (also via Docker)
  2. For simple documents: Pandoc or pure-Python solutions
  3. For environments without a GUI: prefer pdf2docx, python-docx, odfpy
  4. Avoid pywin32 or other Windows-specific libraries if you want cross-platform code
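These tips can be condensed into a small lookup that suggests a tool from the file extensions. It is only an illustration: the `SUGGESTIONS` mapping and `suggest_tool` are hypothetical names, and the pairs listed are the ones covered in this article rather than an exhaustive list.

```python
from pathlib import Path

# (source extension, target extension) -> tool suggested in this article
SUGGESTIONS = {
    (".odt", ".docx"): "odfpy + python-docx",
    (".pdf", ".docx"): "pdf2docx",
    (".csv", ".ods"): "pandas",
    (".xlsx", ".ods"): "pandas",
    (".ods", ".xlsx"): "pandas",
    (".md", ".docx"): "pandoc",
    (".tex", ".pdf"): "pandoc",
}


def suggest_tool(source, target):
    """Return the suggested tool for converting source into target."""
    key = (Path(source).suffix.lower(), Path(target).suffix.lower())
    # Fall back to LibreOffice/unoconv, the most complete option
    return SUGGESTIONS.get(key, "unoconv")
```

For example, `suggest_tool('slides.pptx', 'slides.pdf')` falls through to `unoconv`, matching the first tip.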


