Free Support Forum - groupdocs.cloud

Conversion from pdf to txt

Hi,

I get this message in the converted file, even though the conversion is reported as successful:

Please wait…
If this message is not eventually replaced by the proper contents of the document, your PDF
viewer may not be able to display this type of document.
You can upgrade to the latest version of Adobe Reader for Windows®, Mac, or Linux® by
visiting http://www.adobe.com/go/reader_download.
For more assistance with Adobe Reader visit http://www.adobe.com/go/acrreader.
Windows is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries. Mac is a trademark
of Apple Inc., registered in the United States and other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other
countries.

Any idea how to solve this issue?

thank you
vittorio

@Vittorio12

We would appreciate it if you could share your input and output documents here, along with the sample code. We will investigate the issue and share our findings with you.

Hi,

I have uploaded the file, and the code follows below. I would also like to download the converted files.

I believe the file is an XFA form.

thank you

# Import modules

import groupdocs_conversion_cloud
import os
import uuid
import textract

# Get your app_sid and app_key at https://dashboard.groupdocs.cloud (free registration is required).

app_sid = "4f26ddd9-216d-4607-bc49-86f54c5136fa"
app_key = "a5c75f680820c7faa9791f7cd75ef5cf"

# Create instances of the API

convert_api = groupdocs_conversion_cloud.ConvertApi.from_keys(app_sid, app_key)
file_api = groupdocs_conversion_cloud.FileApi.from_keys(app_sid, app_key)

try:

    # Upload source files to storage

    pdf_list_tot = []
    pdf_list = []

    dir_input = '/Users/vittoriopro/Documents/AI/MSDS/1_Capstone/Datasets/Tefin//MSC/FREE'
    dir_output = '/Users/vittoriopro/Documents/AI/MSDS/1_Capstone/Datasets/Tefin//MSC/DOC'

    # Only the first three files in the input directory are processed
    pdf_list_tot = os.listdir(path=dir_input)
    pdf_list = pdf_list_tot[0:3]

    for filename in pdf_list:

        strformat = 'docx'
        file, extension = os.path.splitext(filename)
        output_name = file + '.docx'
        filename = os.path.join(dir_input, filename)
        # filename is already an absolute path here, so this join returns it unchanged
        remote_name = os.path.join(dir_input, filename)

        # os.rename(filename, base + '.docx')

        output_name = os.path.join(dir_output, output_name)

        request_upload = groupdocs_conversion_cloud.UploadFileRequest(remote_name, filename)
        response_upload = file_api.upload_file(request_upload)

        # Convert PDF to Word document
        settings = groupdocs_conversion_cloud.ConvertSettings()
        settings.file_path = remote_name
        settings.format = strformat
        settings.output_path = output_name

        loadOptions = groupdocs_conversion_cloud.PdfLoadOptions()
        loadOptions.hide_pdf_annotations = True
        loadOptions.remove_embedded_files = False
        loadOptions.flatten_all_fields = True

        settings.load_options = loadOptions

        convertOptions = groupdocs_conversion_cloud.DocxConvertOptions()
        convertOptions.from_page = 1
        convertOptions.pages_count = 1

        settings.convert_options = convertOptions

        request = groupdocs_conversion_cloud.ConvertDocumentRequest(settings)
        response = convert_api.convert_document(request)

        print("Document converted successfully: " + str(response))

except groupdocs_conversion_cloud.ApiException as e:
    print("Exception when uploading or converting the document: {0}".format(e.message))

SR_32936_MSC_NINA_F.pdf (866.7 KB)

@Vittorio12

Yes, you are right. The issue is caused by the PDF XFA form. We have logged a ticket, CONVERSIONCLOUD-380, for further investigation and resolution. Meanwhile, you can achieve your requirement in two steps: first convert the PDF XFA form to a standard AcroForm, then convert the result to the desired format (docx). We will share a complete solution in the GroupDocs.Conversion Cloud API as soon as possible.
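
Until a fix is available, it may help to find out which files need the extra step before uploading them. The following is a minimal sketch, not part of the GroupDocs SDK: the helper names (looks_like_xfa, is_placeholder_output, plan_conversion) are hypothetical, and the XFA check is only a heuristic that searches the raw bytes for an /XFA key. It can miss forms whose AcroForm dictionary is stored in a compressed object stream. The XFA-to-AcroForm step itself is not shown, since it depends on the tool you choose.

```python
import re

# Distinctive sentence from the Adobe fallback page quoted in the first post.
PLACEHOLDER_MARKER = (
    "If this message is not eventually replaced by the proper contents of the document"
)


def looks_like_xfa(pdf_bytes: bytes) -> bool:
    """Heuristic: report True if the raw PDF bytes contain an /XFA key.

    Assumption: the AcroForm dictionary is stored uncompressed. A PDF that
    keeps it inside a compressed object stream will not be detected.
    """
    return re.search(rb"/XFA\b", pdf_bytes) is not None


def is_placeholder_output(text: str) -> bool:
    """Report True if converted text is the Adobe 'Please wait' fallback page."""
    normalized = " ".join(text.split())  # collapse line breaks and extra spaces
    return PLACEHOLDER_MARKER in normalized


def plan_conversion(pdf_bytes: bytes) -> list:
    """Return the ordered steps for one file, following the two-step workaround."""
    steps = []
    if looks_like_xfa(pdf_bytes):
        steps.append("convert XFA form to standard AcroForm")
    steps.append("convert PDF to docx")
    return steps
```

For example, calling plan_conversion(open(path, "rb").read()) on each input file shows which ones need the XFA-to-AcroForm step first, and is_placeholder_output can be run on the extracted text of a result to catch conversions that are reported as successful but contain only the fallback page.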