How To Convert XFA PDF Form to DOCX in Python Using GroupDocs.Conversion REST API

Vittorio12 · August 17, 2020, 4:28pm

Hi,

I get the following message in the converted file, even though the conversion is reported as successful:

Please wait…
If this message is not eventually replaced by the proper contents of the document, your PDF
viewer may not be able to display this type of document.
You can upgrade to the latest version of Adobe Reader for Windows®, Mac, or Linux® by
visiting http://www.adobe.com/go/reader_download.
For more assistance with Adobe Reader visit http://www.adobe.com/go/acrreader.
Windows is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries. Mac is a trademark
of Apple Inc., registered in the United States and other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other
countries.

Any idea how to solve this issue?

thank you
vittorio

tilal.ahmad · August 17, 2020, 4:57pm

@Vittorio12

We would appreciate it if you could share your input and output documents here, along with the sample code. We will investigate the issue and share our findings with you.

Vittorio12 · August 17, 2020, 5:31pm

Hi,

I am converting an XFA PDF form to DOCX in Python. I have uploaded the file, and the code follows below. I would also like to download the converted files.

I believe the file is in XFA format.

thank you

Convert XFA Form to DOCX in Python using GroupDocs.Conversion REST API

# Import modules
import groupdocs_conversion_cloud
import os


# Get your app_sid and app_key at https://dashboard.groupdocs.cloud (free registration is required).
app_sid = "4f26ddd9-233d-4567-bu49-86f54c5136fa"
app_key = "a5c75y680820c7frr9791f7cd75ef5cf"

# Create instance of the API
convert_api = groupdocs_conversion_cloud.ConvertApi.from_keys(app_sid, app_key)
file_api = groupdocs_conversion_cloud.FileApi.from_keys(app_sid, app_key)

try:

        # Upload source file to storage
        
        dir_input = '/Users/vittoriopro/Documents/AI/MSDS/1_Capstone/Datasets/Tefin//MSC/FREE'
        dir_output = '/Users/vittoriopro/Documents/AI/MSDS/1_Capstone/Datasets/Tefin//MSC/DOC'

        # Process only the first three files in the input directory
        pdf_list_tot = os.listdir(path=dir_input)
        pdf_list = pdf_list_tot[0:3]

        for filename in pdf_list:
            
            strformat = 'docx'
            file, extension = os.path.splitext(filename)
            # Local path of the source PDF; the same string is used as its path in cloud storage
            filename = os.path.join(dir_input, filename)
            remote_name = filename
            # Path of the converted DOCX file in cloud storage
            output_name = os.path.join(dir_output, file + '.docx')

            # UploadFileRequest(storage path, local file path)
            request_upload = groupdocs_conversion_cloud.UploadFileRequest(remote_name, filename)
            response_upload = file_api.upload_file(request_upload)
            
            # Convert PDF to Word document
            settings = groupdocs_conversion_cloud.ConvertSettings()
            settings.file_path = remote_name
            settings.format = strformat
            settings.output_path = output_name
            
            # PDF load options: hide annotations, keep embedded files, flatten form fields
            loadOptions = groupdocs_conversion_cloud.PdfLoadOptions()
            loadOptions.hide_pdf_annotations = True
            loadOptions.remove_embedded_files = False
            loadOptions.flatten_all_fields = True

            settings.load_options = loadOptions

            # DOCX convert options: convert only the first page
            convertOptions = groupdocs_conversion_cloud.DocxConvertOptions()
            convertOptions.from_page = 1
            convertOptions.pages_count = 1
                
            settings.convert_options = convertOptions
                    
            request = groupdocs_conversion_cloud.ConvertDocumentRequest(settings)
            response = convert_api.convert_document(request)
    
            print("Document converted successfully: " + str(response))
except groupdocs_conversion_cloud.ApiException as e:
        print("Exception when uploading or converting the document: {0}".format(e.message))

SR_32936_MSC_NINA_F.pdf (866.7 KB)
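
On the download part of the question, here is a minimal sketch. It assumes that the SDK's FileApi.download_file returns the path of a temporary local copy and that DownloadFileRequest accepts the storage path as its first argument; both are assumptions about the SDK and worth checking against its documentation. download_converted is a hypothetical helper name. It takes the API object and the request class as parameters so it can be tried without credentials.

```python
import os
import shutil


def download_converted(file_api, make_request, storage_path, local_dir):
    """Download one file from cloud storage into local_dir and return its local path.

    file_api     -- object with a download_file(request) method (assumed: FileApi)
    make_request -- callable building a request from a storage path
                    (assumed: groupdocs_conversion_cloud.DownloadFileRequest)
    """
    os.makedirs(local_dir, exist_ok=True)
    # Assumption: download_file returns the path of a temporary local file.
    temp_path = file_api.download_file(make_request(storage_path))
    # Storage paths may use either slash style; keep only the file name.
    file_name = storage_path.replace("\\", "/").rsplit("/", 1)[-1]
    destination = os.path.join(local_dir, file_name)
    shutil.move(temp_path, destination)
    return destination
```

Inside the loop it could be called after convert_document, for example as download_converted(file_api, groupdocs_conversion_cloud.DownloadFileRequest, settings.output_path, dir_output).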

tilal.ahmad · August 18, 2020, 4:43am

@Vittorio12

Yes, you are right. The problem is caused by the XFA PDF form: we were able to reproduce the issue when converting it to DOCX in Python using the GroupDocs.Conversion REST API. We have logged ticket CONVERSIONCLOUD-380 for further investigation and resolution.

Meanwhile, you can meet your requirement in two steps: first convert the XFA PDF form to a standard AcroForm, and then convert that to the desired format (DOCX). We will share a complete solution in GroupDocs.Conversion Cloud API as soon as possible.
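
Until then, a batch script can at least separate XFA forms from ordinary PDFs before uploading. The following is a minimal sketch of a heuristic check and is not part of the GroupDocs SDK. looks_like_xfa and split_by_xfa are hypothetical helper names. The check assumes the /XFA key is visible in the raw file bytes, so it can miss forms whose form dictionary sits inside a compressed object stream.

```python
import os


def looks_like_xfa(pdf_path, chunk_size=1 << 20):
    """Heuristic: return True if the raw PDF bytes contain an /XFA key."""
    marker = b"/XFA"
    tail = b""
    with open(pdf_path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if marker in window:
                return True
            # Keep the last few bytes so a marker split across chunks is still found.
            tail = window[-(len(marker) - 1):]


def split_by_xfa(directory):
    """Return (xfa_files, other_files): names of the PDFs in directory, sorted."""
    xfa_files, other_files = [], []
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(".pdf"):
            continue
        if looks_like_xfa(os.path.join(directory, name)):
            xfa_files.append(name)
        else:
            other_files.append(name)
    return xfa_files, other_files
```

Files in the first list would go through the two-step route described above; the rest can be converted directly.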