Python Convert PDF to Microsoft Word Document without Adobe Acrobat Installed

The groupdocs.cloud API generated the following error:

Error: ConvertDocument. Parameters: convertSettings ‘{“FilePath”:“Sample.pdf”,“Format”:“docx”,“LoadOptions”:{“RemoveEmbeddedFiles”:false,“HidePdfAnnotations”:true,“FlattenAllFields”:true},“ConvertOptions”:{“Width”:0,“Height”:0,“Dpi”:96.0,“Zoom”:100,“FromPage”:1,“PagesCount”:1},“OutputPath”:“sample.docx”}’. Exception: The surrogate pair (0xD835, 0xD835) is invalid. A high surrogate character (0xD800 - 0xDBFF) must always be paired with a low surrogate character (0xDC00 - 0xDFFF)…

  • Trace Id: 1-5ec490c8-8d550b75af9c69a8a59db819
  • Timestamp: 5/20/2020 2:07:04 AM
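For context, the exception above refers to UTF-16 surrogate pairs: a code unit in the range 0xD800 - 0xDBFF (high) is only valid when immediately followed by one in the range 0xDC00 - 0xDFFF (low). The reported pair (0xD835, 0xD835) is two high surrogates in a row. The sketch below only illustrates that rule; the function names `find_unpaired_surrogates` and `decode_pair` are hypothetical helpers written for this note, not part of the GroupDocs SDK, and the sketch makes no claim about how the service itself parses the PDF.

```python
def find_unpaired_surrogates(units):
    """Return the indices of UTF-16 code units that do not form a valid pair."""
    bad = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:  # high surrogate
            nxt = units[i + 1] if i + 1 < len(units) else None
            if nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
                i += 2  # valid high + low pair, skip both units
                continue
            bad.append(i)  # high surrogate without a following low surrogate
        elif 0xDC00 <= unit <= 0xDFFF:
            bad.append(i)  # low surrogate without a preceding high surrogate
        i += 1
    return bad


def decode_pair(high, low):
    """Combine a valid high/low surrogate pair into one Unicode code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The pair from the error message: two high surrogates, so both are unpaired.
print(find_unpaired_surrogates([0xD835, 0xD835]))  # -> [0, 1]

# A well-formed pair that starts with the same high surrogate.
print(find_unpaired_surrogates([0xD835, 0xDC00]))  # -> []
print(hex(decode_pair(0xD835, 0xDC00)))            # -> 0x1d400
```

High surrogate 0xD835 leads into the Mathematical Alphanumeric Symbols block, so one possible trigger is a PDF containing such symbols with a damaged text encoding; that is a guess, and the original document is needed to confirm it.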

Notes:

  • This post is private. Only our employees and the author can see it unless the customer makes it public.
  • To help us investigate the problem, please consider attaching a document that reproduces the error. It does not have to be the original document that caused the error.

@klg45014

We are sorry for the inconvenience. We have tested the scenario and were unable to reproduce the reported issue. Please find below sample Python code to convert a PDF to a Microsoft Word document without Adobe Acrobat installed.

Steps to convert PDF to MS Word in Python

  • Sign up for free at groupdocs.cloud to get credentials
  • Install GroupDocs.Conversion Cloud SDK for Python from PIP (pip install groupdocs-conversion-cloud)
  • Create a script file and import the groupdocs_conversion_cloud package
  • Create an instance of the API
  • Upload PDF file to cloud storage
  • Convert PDF to Word Document
  • Download output Word Document from cloud storage

Code: Python Convert PDF to Microsoft Word Document

    # Import module
    import groupdocs_conversion_cloud
    from shutil import copyfile

    # Get your app_sid and app_key at https://dashboard.groupdocs.cloud (free registration is required).
    app_sid = "xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"
    app_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

    # Create instance of the API
    convert_api = groupdocs_conversion_cloud.ConvertApi.from_keys(app_sid, app_key)
    file_api = groupdocs_conversion_cloud.FileApi.from_keys(app_sid, app_key)

    try:
        # Upload the source file to cloud storage
        filename = '02_pages.pdf'
        remote_name = '02_pages.pdf'
        output_name = 'sample.docx'
        strformat = 'docx'

        request_upload = groupdocs_conversion_cloud.UploadFileRequest(remote_name, filename)
        response_upload = file_api.upload_file(request_upload)

        # Convert PDF to DOCX
        settings = groupdocs_conversion_cloud.ConvertSettings()
        settings.file_path = remote_name
        settings.format = strformat
        settings.output_path = output_name

        # Options applied while loading the source PDF
        load_options = groupdocs_conversion_cloud.PdfLoadOptions()
        load_options.hide_pdf_annotations = True
        load_options.remove_embedded_files = False
        load_options.flatten_all_fields = True
        settings.load_options = load_options

        # Convert only the first page
        convert_options = groupdocs_conversion_cloud.DocxConvertOptions()
        convert_options.from_page = 1
        convert_options.pages_count = 1
        settings.convert_options = convert_options

        request = groupdocs_conversion_cloud.ConvertDocumentRequest(settings)
        response = convert_api.convert_document(request)
        print("Document converted successfully: " + str(response))

        # Download the converted document from cloud storage
        request_download = groupdocs_conversion_cloud.DownloadFileRequest(output_name)
        response_download = file_api.download_file(request_download)

        # download_file returns the path of a temporary file; copy it locally
        copyfile(response_download, 'sample_copy.docx')
        print("Result {}".format(response_download))

    except groupdocs_conversion_cloud.ApiException as e:
        print("Exception while converting PDF to DOCX: {0}".format(e.message))
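After the download step, it can be useful to confirm that the copied file is really a Word document before opening it. A DOCX file is a ZIP container that holds a `word/document.xml` part, so a rough check needs only the standard library. This is a minimal sketch: `looks_like_docx` is a hypothetical helper, not an SDK call, and the demo uses in-memory stand-ins rather than a real conversion result.

```python
import io
import zipfile


def looks_like_docx(source):
    """Heuristic check: a DOCX is a ZIP archive containing word/document.xml.

    `source` may be a file path or a binary file-like object.
    """
    if not zipfile.is_zipfile(source):
        return False
    with zipfile.ZipFile(source) as archive:
        return "word/document.xml" in archive.namelist()


# Stand-in for a converted document: a ZIP archive with the expected part.
fake_docx = io.BytesIO()
with zipfile.ZipFile(fake_docx, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")

# Stand-in for a file that is not a DOCX (for example, the source PDF).
fake_pdf = io.BytesIO(b"%PDF-1.4\n" + b"0" * 64)

print(looks_like_docx(fake_docx))  # -> True
print(looks_like_docx(fake_pdf))   # -> False
```

With the sample above, the call would be `looks_like_docx('sample_copy.docx')` once `copyfile` has finished. The check only inspects the container, so it does not prove the content converted correctly.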