Pure Go library for creating and processing Office Word (.docx), Excel (.xlsx) and PowerPoint (.pptx) documents

Overview

unioffice is a library for creating Office Open XML documents (.docx, .xlsx and .pptx). Its goal is to be the most compatible and highest-performance Go library for creating and editing docx/xlsx/pptx files.

https://github.com/unidoc/unioffice/

Status

  • Documents (docx) [Word]
    • Read/Write/Edit
    • Formatting
    • Images
    • Tables
  • Spreadsheets (xlsx) [Excel]
    • Read/Write/Edit
    • Cell formatting including conditional formatting
    • Cell validation (drop down combobox, rules, etc.)
    • Retrieve cell values as formatted by Excel (e.g. retrieve a date or number as displayed in Excel)
    • Formula Evaluation (100+ functions supported currently, more will be added as required)
    • Embedded Images
    • All chart types
  • Presentations (pptx) [PowerPoint]
    • Creation from templates
    • Textboxes/shapes

Performance

There has been a great deal of interest in performance numbers for spreadsheet creation/reading lately, so here are unioffice numbers for this benchmark which creates a sheet with 30k rows, each with 100 columns.

creating 30000 rows * 100 cells took 3.92506863s
saving took 89ns
reading took 9.522383048s

Creation is fairly fast, saving is very quick due to no reflection usage, and reading is a bit slower. The downside is that the binary is large (33MB) as it contains generated structs, serialization and deserialization code for all of DOCX/XLSX/PPTX.

Installation

go get github.com/unidoc/unioffice/

Document Examples

Spreadsheet Examples

Presentation Examples

Raw Types

The OOXML specification is large and creating a friendly API to cover the entire specification is a very time consuming endeavor. This library attempts to provide an easy to use API for common use cases in creating OOXML documents while allowing users to fall back to raw document manipulation should the library's API not cover a specific use case.

The raw XML based types reside in the schema/ directory. These types are accessible from the wrapper types via a X() method that returns the raw type.

For example, the library currently doesn't have an API for setting a document background color. However it's easy to do manually via editing the CT_Background element of the document.

doc := document.New()
doc.X().Background = wordprocessingml.NewCT_Background()
doc.X().Background.ColorAttr = &wordprocessingml.ST_HexColor{}
doc.X().Background.ColorAttr.ST_HexColorRGB = color.RGB(50, 50, 50).AsRGBString()

Contribution guidelines

All contributors must sign a contributor license agreement before their code will be reviewed and merged.

Licensing

This software package (unioffice) is a commercial product and requires a license code to operate.

The use of this software package is governed by the end-user license agreement (EULA) available at: https://unidoc.io/eula/

To obtain a Trial license code to evaluate the software, please visit https://unidoc.io/

Comments
  • Nil pointer when trying to extract image data

    Description

    Hey folks. I am trying to extract raw image data from an MS word document. Here is the code snippet:

    	doc, err := document.Read(reader, reader.Size())
    	if err != nil {
    		return "", nil, fmt.Errorf("document read failure with error: %v", err)
    	}
    	defer doc.Close()
    
    	for _, img := range doc.Images {
    		if img.Data() == nil {
    			ctx.Logger().Warn("received an image with nil data")
    			continue
    		}
    		_, imgResults, err := ScanImage(ctx, clients, n, f, *img.Data())
    		if err != nil {
    			ctx.Logger().Errorf("image scanning failure with error: %v", err)
    		}
    	}
    

    The doc.Images array is successfully populated with the number of images in the document however when I call img.Data() I receive a nil pointer.

    Expected Behavior

    img.Data() should return a non-nil pointer

    Actual Behavior

    img.Data() returns a nil pointer

    Please include a reproducible code snippet or document attachment that demonstrates the issue.

    opened by JNimkarLS 16
  • Some Bugs In Windows

    Description

    Hi, author. First of all, I love this library. My English is not good, so please forgive me if I use the wrong words. I was excited when I read your examples, but after running the example code I ran into a problem: when I open the generated file, Windows tells me it cannot open the file.

    Expected Behavior

    Actual Behavior

    opened by siskinc 14
  • Unsupported purl.oclc.org (strict ooxml namespace)

    Description

    I'm hitting a weird issue when updating FldChar's. I need to replace the default text in some form fields. If I mark the fields as dirty or set SetUpdateFieldsOnOpen(true) and then select to update the entire table as prompted when the document opens, saving the result makes it unusable by gooxml. I get the following warnings and doc.X().Body is nil:

    2019/02/19 16:34:16 unsupported relationship type: http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument tgt: word/document.xml
    2019/02/19 16:34:16 unsupported relationship type: http://purl.oclc.org/ooxml/officeDocument/relationships/extendedProperties tgt: docProps/app.xml
    

    Digging through the raw XML, it doesn't look like an issue with the actual FldChar's that are being altered, it looks like an issue with the document namespacing. For some reason, after updating the form fields, some of the http://schemas.openxmlformats.org/officeDocument/2006/... attributes get replaced with http://purl.oclc.org/ooxml/officeDocument/....

    I don't know enough about the spec to know why these would change. But the resulting document opens just fine with Word. It is only gooxml that has an issue.

    I tried this with and without the presence of a table of contents. When the table of contents isn't present, there is no issue. When the table of contents is present, and you select "update entire table", then you experience the problem after saving the document.

    Expected Behavior

    Handle documents using purl.oclc.org namespacing appropriately.

    Actual Behavior

    Documents with purl.oclc.org namespacing have a nil Body.

    I've attached before and after documents. The actual FldChar changes appear starting on page 11 (a result of re-using code that originally produced the issue). If you need the code making the changes or simplified before/after documents, just let me know.

    after.docx before.docx
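
    The warnings above name two relationship types from the Strict OOXML conformance class. One way a reader could cope with them is to translate each Strict type URI to its Transitional counterpart before dispatching on it. The following is only a minimal sketch of that idea: normalizeRelType and its lookup table are hypothetical and not part of gooxml or unioffice, and the table covers only the two types from the log (note that the final path segment can differ as well, so a plain prefix swap is not enough).

```go
package main

import "fmt"

// strictToTransitional maps Strict relationship type URIs to their
// Transitional equivalents. Deliberately incomplete: it lists only the
// two types that appear in the warnings quoted in the issue.
var strictToTransitional = map[string]string{
	"http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument":     "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument",
	"http://purl.oclc.org/ooxml/officeDocument/relationships/extendedProperties": "http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties",
}

// normalizeRelType (hypothetical helper) returns the Transitional form of a
// known Strict relationship type and leaves every other value unchanged.
func normalizeRelType(t string) string {
	if mapped, ok := strictToTransitional[t]; ok {
		return mapped
	}
	return t
}

func main() {
	in := "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument"
	fmt.Println(normalizeRelType(in))
}
```

    A lookup table is used instead of string rewriting so that unknown types pass through untouched and can still be reported as unsupported.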

    opened by freb 13
  • Support adding/replacing MERGEFIELDs

    I'm trying to understand if/how 'MERGEFIELDS' are supported within gooxml, or if it is the kind of thing I would need to drop into .X() to handle?

    I did see that there are doc.FormFields(), r.AddField(), etc functions, but as best I could tell, these didn't seem to do what I want. I also came across the 'KnownFields', which seems to correlate with this, but couldn't tell if it was associated to some deeper support/code:

    • https://github.com/baliance/gooxml/blob/master/document/knownfields.go

    Essentially, is there a way to create, read, edit/update, etc these elements in a gooxml native way currently? And if not, do you have any suggestions of the best way to interact with them?

    Below is a snippet from a document that uses these fields:

    <w:p w14:paraId="1566BC4D" w14:textId="3B6A9F12" w:rsidR="006D368D" w:rsidRPr="00497636" w:rsidRDefault="000E0283">
            <w:pPr>
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
            </w:pPr>
            <w:r>
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:t>Merge Field:</w:t>
            </w:r>
            <w:r w:rsidR="006D368D">
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:t xml:space="preserve">
                </w:t>
            </w:r>
            <w:r w:rsidRPr="00497636">
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:fldChar w:fldCharType="begin"/>
            </w:r>
            <w:r w:rsidRPr="00497636">
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:instrText xml:space="preserve"> MERGEFIELD  $Foo.Bar  \* MERGEFORMAT </w:instrText>
            </w:r>
            <w:r w:rsidRPr="00497636">
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:fldChar w:fldCharType="separate"/>
            </w:r>
            <w:r w:rsidRPr="00497636">
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:t>«$Foo.Bar»</w:t>
            </w:r>
            <w:r w:rsidRPr="00497636">
                <w:rPr>
                    <w:lang w:val="en-AU"/>
                </w:rPr>
                <w:fldChar w:fldCharType="end"/>
            </w:r>
        </w:p>
    

    Refs:

    • https://github.com/baliance/gooxml/blob/master/document/knownfields.go
    • http://officeopenxml.com/WPfields.php
    • http://officeopenxml.com/WPfieldInstructions.php
    • http://officeopenxml.com/WPgeneralFieldSwitches.php
    opened by 0xdevalias 12
  • recompile the example and cannot open the generate .docx

    Environment: (1) OS: Windows 7 x64, VS Code; (2) Go version 1.9.

    First I ran "go get baliance.com/gooxml" and then "go build baliance.com/gooxml/...", but the compiler failed with the error "The filename or extension is too long." I therefore renamed the "schemas.openxmlformats.org" directory to "s" and changed every "schemas.openxmlformats.org" path in all files to "s". After that, it compiled successfully.

    I then tested "_examples/document/tables/main.go" with "go run main.go". It appeared to succeed: there was no error and the "tables.docx" file was generated. However, tables.docx cannot be opened. I tested the other examples as well; they all generate a .docx file, but none of them can be opened with MS Office.

    Cheers, looking forward to your reply.

    opened by thxallvu 12
  • add the ability to utilize footnotes and endnotes in documents

    Added:

    • Basic CRUD functions to handle both endnotes and footnotes
    • Tester functions in convention of the library (e.g., HasFootnotes or IsFootnote)
    • Tests to cover the added functionality.

    opened by compleatang 9
  • [Question] Handling excel data

    As far as I can see, there is no way to get the data from an Excel file as a matrix when iterating row by row: unioffice omits empty columns in a row.

    Is there any way to get all the data from an Excel file, including empty columns?
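
    One workaround that does not depend on any particular library API is to rebuild a dense row from the cell references of the cells that are present. The sketch below assumes each cell exposes its reference (such as "D1") and a string value; the cell type, columnIndex and denseRow are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// cell is a stand-in for whatever the spreadsheet library returns:
// a reference such as "D1" plus the cell's value as a string.
type cell struct {
	Ref   string
	Value string
}

// columnIndex converts the leading column letters of a cell reference to a
// zero-based index: "A1" gives 0, "Z3" gives 25, "AA10" gives 26.
func columnIndex(ref string) (int, error) {
	col, letters := 0, 0
	for _, r := range ref {
		if r < 'A' || r > 'Z' {
			break
		}
		col = col*26 + int(r-'A') + 1
		letters++
	}
	if letters == 0 {
		return 0, fmt.Errorf("no column letters in %q", ref)
	}
	return col - 1, nil
}

// denseRow places each cell at its column position and leaves the gaps as
// empty strings, so every row has exactly width entries.
func denseRow(cells []cell, width int) ([]string, error) {
	row := make([]string, width)
	for _, c := range cells {
		i, err := columnIndex(c.Ref)
		if err != nil {
			return nil, err
		}
		if i >= width {
			return nil, fmt.Errorf("cell %s is outside width %d", c.Ref, width)
		}
		row[i] = c.Value
	}
	return row, nil
}

func main() {
	row, err := denseRow([]cell{{"A1", "x"}, {"D1", "y"}}, 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", row)
}
```

    Applying denseRow to every row with the same width yields a rectangular matrix in which skipped columns appear as empty strings.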

    opened by polderudo 9
  • [UO-129] Incorrect conversion of doc to pdf

    Description

    Incorrect conversion of doc to pdf

    Expected Behavior

    doc to pdf converted correctly

    Actual Behavior

    	outputPath := fmt.Sprintf("output/%s.pdf", filename)
    	doc, err := document.Open("https://github.com/unidoc/unioffice/files/8096886/liuna.docx")
    	if err != nil {
    		log.Fatalf("error opening document: %s", err)
    	}
    	defer doc.Close()
    	c := convert.ConvertToPdf(doc)
    
    	err = c.WriteToFile(outputPath)
    	if err != nil {
    		log.Fatalf("error converting document: %s", err)
    	}
    

    The conversion result is incorrect. When I convert the same file with https://foxyutils.com/wordtopdf/, the result is correct.

    opened by springswen 8
  • Bug in RunProperties.IsBold()

    This is the method code:

    func (r RunProperties) IsBold() bool {
    	return r.x.B != nil
    }
    

    It works in most cases, since a non-bold run's properties look like this:

                B: (*wml.CT_OnOff)(<nil>),
    

    And a bold run's properties look like this:

                    B: (*wml.CT_OnOff)(0xc42000e450)({
                     ValAttr: (*sharedTypes.ST_OnOff)(<nil>)
                    }),
    

    However, if the style uses bold by default, and a run turns it off, then B will look like this:

                    B: (*wml.CT_OnOff)(0xc42000e3a0)({
                     ValAttr: (*sharedTypes.ST_OnOff)(0xc420011d80)(false)
                    })
    

    Therefore, IsBold() would falsely say that the run is bold.

    This seems to map directly from XML where <w:b/> is short for <w:b val="true"/>, and turning bold off requires <w:b val="false"/>.

    Actually I can't say what the proper behaviour should be, because a boolean is not enough to express the three possible states:

    • No change to boldness
    • Turn on boldness
    • Turn off boldness

    Perhaps two methods are needed: IsBold() and BoldSet():

    func (r RunProperties) IsBold() bool {
    	if r.x.B != nil {
    		if r.x.B.ValAttr != nil && r.x.B.ValAttr.Bool != nil && *r.x.B.ValAttr.Bool == false {
    			return false
    		} else {
    			return true
    		}
    	} else {
    		return false
    	}
    }
    func (r RunProperties) BoldSet() bool {
    	return r.x.B != nil
    }
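
    The three states listed above could also be reported through a single accessor that returns a small enum rather than a boolean. This is only a sketch under stated assumptions: onOff and runProps are simplified stand-ins for the generated CT_OnOff and run-properties types, and BoldState and boldState are hypothetical names, not library API.

```go
package main

import "fmt"

// onOff is a simplified stand-in for the generated CT_OnOff type.
// A nil Val mirrors <w:b/>, which means "on".
type onOff struct {
	Val *bool
}

// runProps is a simplified stand-in for the raw run properties.
type runProps struct {
	B *onOff
}

// BoldState distinguishes the three cases a plain bool cannot express.
type BoldState int

const (
	BoldInherit BoldState = iota // no <w:b> element: boldness is inherited
	BoldOn                       // <w:b/> or <w:b val="true"/>
	BoldOff                      // <w:b val="false"/>
)

// boldState reports which of the three states the run properties encode.
func boldState(p runProps) BoldState {
	switch {
	case p.B == nil:
		return BoldInherit
	case p.B.Val != nil && !*p.B.Val:
		return BoldOff
	default:
		return BoldOn
	}
}

func main() {
	off := false
	fmt.Println(boldState(runProps{}) == BoldInherit)
	fmt.Println(boldState(runProps{B: &onOff{}}) == BoldOn)
	fmt.Println(boldState(runProps{B: &onOff{Val: &off}}) == BoldOff)
}
```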
    

    I also noticed an oddity in IsItalic():

    func (r RunProperties) IsItalic() bool {
    	if r.x == nil {
    		return false
    	}
    	return r.x.I != nil
    }
    

    All other methods assume that r.x is safely non-nil, so why this?

    opened by preciselytom 8
  • Does not compile on Windows

    Description

    Trying to use document.New() on Windows.

    Expected Behavior

    It compiles.

    Actual Behavior

    Getting an error:

    go build baliance.com/gooxml/schema/soo/wml: C:\Go\pkg\tool\windows_amd64\compile.exe: fork/exec C:\Go\pkg\tool\windows_amd64\compile.exe: The filename or extension is too long.
    
    opened by levrik 8
  • Unable to Extract Images from Docx File

    Description

    I have a docx file that contains a single image. Here's the code I am using to pull out the images:

    	reader := bytes.NewReader(content)
    	doc, err := presentation.Read(reader, reader.Size())
    	if err != nil {
    		return "", nil, fmt.Errorf("presentation read failure with error: %v", err)
    	}
    	if doc == nil {
    		return "", nil, fmt.Errorf("internal error: [presentation.Read] returned a nil pointer")
    	}
    	defer doc.Close()
    
    	for _, img := range doc.Images {
    		if img.Path() == "" {
    			ctx.Logger().Warn("received an image with an empty path")
    			continue
    		}
    		data, err := os.ReadFile(img.Path())
    		if err != nil {
    			ctx.Logger().Error("failed to read file: %s with error: %v", img.Path(), err)
    			continue
    		}
    	}
    	extracted := doc.ExtractText()
    

    Expected Behavior

    For the file I've attached below, it contains one image. However, the length of doc.Images is 0 and it should be a length of 1. As a note, I have created this document on Microsoft OneDrive and downloaded it from OneDrive to share with you.

    Actual Behavior

    The length of doc.Images is 0, so the for loop is never entered. Attachment: J1.docx

    opened by JNimkarLS 7
  • How to know a textline belongs to which page in a docx file

    Description

    Is there any way to know a text line belongs to which page in a .docx file?

    Expected Behavior

    We should have an attribute in a text Item to get the page information:

    for _, e := range extracted.Items {
          text := e.Text
          pageIndex := e.PageIndex // requested field; it does not exist today
    }
    

    Actual Behavior

    There is only Text, DrawingInfo, Paragraph, Hyperlink, TableInfo BUT no PageInfo in the TextItem.

    opened by hoangthanh283 1
  • Can I add html markup to doc files?

    Description

    Is there a plan for the Word document package to support HTML tags? If not, how can I render the markup below when creating a Word document?

    <b>Bold first</b><div><i><b>Bold second</b></i></div><div><i><b>There are so many of us </b>asasas</i>asas<i>asasasasas<u>asasasa</u></i></div><div><i><u><br></u></i></div>
    

    Expected Behavior

    The above HTML string should be displayed with bold, italic and underline formatting.

    Actual Behavior

    The markup is printed as a literal string when passed to AddText().
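
    Until something like this is supported, one possible approach is to split the markup into text runs with formatting flags and then create one document run per entry. The sketch below does only the first half, using Go's standard encoding/xml decoder in its lenient HTML mode. The styledRun type and htmlToRuns function are hypothetical, only the b, i and u tags are interpreted, and applying the flags to a document is left to the caller.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// styledRun is a piece of text together with the formatting in effect.
type styledRun struct {
	Text      string
	Bold      bool
	Italic    bool
	Underline bool
}

// htmlToRuns (hypothetical helper) walks simple HTML markup and returns one
// styledRun per non-empty text node, tracking how many b/i/u tags are open.
func htmlToRuns(src string) ([]styledRun, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	// Lenient settings suggested by encoding/xml for HTML-like input.
	dec.Strict = false
	dec.AutoClose = xml.HTMLAutoClose
	dec.Entity = xml.HTMLEntity

	open := map[string]int{}
	var runs []styledRun
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return runs, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			open[strings.ToLower(t.Name.Local)]++
		case xml.EndElement:
			open[strings.ToLower(t.Name.Local)]--
		case xml.CharData:
			text := string(t)
			if strings.TrimSpace(text) == "" {
				continue
			}
			runs = append(runs, styledRun{
				Text:      text,
				Bold:      open["b"] > 0,
				Italic:    open["i"] > 0,
				Underline: open["u"] > 0,
			})
		}
	}
}

func main() {
	runs, err := htmlToRuns("<b>Bold first</b><div><i><b>Bold second</b></i></div>")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range runs {
		fmt.Printf("%q bold=%v italic=%v underline=%v\n", r.Text, r.Bold, r.Italic, r.Underline)
	}
}
```

    Real-world HTML is messier than this input; a dedicated HTML parser would be the safer choice for anything beyond simple, well-nested fragments.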

    opened by brandonheng168 0
  • Document append problem

    Description

    When I use doc0.Append(doc1), the output file fails to open.

    Expected Behavior

    Actual Behavior

    opened by shenghui0779 3
  • column.SetStyle does not work with wrapped style

    When using the following function, it doesn't seem to set the cells to wrapped properly. If I use the exact same process on an individual cell it works properly. https://pkg.go.dev/gitea.com/unidoc/unioffice/spreadsheet#Column.SetStyle

    Expected Behavior

    When using Column.SetStyle with the wrapped style, all cells in that column should become wrapped.

    An example of the issue:

    style := ss.StyleSheet.AddCellStyle()
    style.SetWrapped(true)
    sheet.Column(1).SetStyle(style)
    sheet.Column(1).SetWidth(200)
    sheet.Cell("A7").SetStyle(style)
    

    In this example, the cell A7 will be wrapped, but no other cells in the first column will be wrapped. I can verify this is the correct column because SetWidth(200) works as intended.

    opened by Logan9312 1
  • [Feature Request] Provide access to the "Alias" and "Tag" properties of StructuredDocumentTag objects.

    Description

    Currently, while structured document tags are available for a document, programmatic access to the title and tag properties for each structured document tag is not. The API only provides access to the paragraphs.

    Expected Behavior

    Each StructuredDocumentTag object provides Alias() and Tag() functions.

    for _, sdt := range doc.StructuredDocumentTags() {
        fmt.Printf("Alias: '%v'\n", sdt.Alias()) // returns an alias or empty string for the SDT
        fmt.Printf("Tag: '%v'\n", sdt.Tag()) // returns a tag or empty string for the SDT
    }
    

    Actual Behavior

    No access to alias or tag for a structured document tag.

    opened by glorious-beard 1
Releases(v1.21.1)
  • v1.21.1(Dec 6, 2022)

    Release notes - UniOffice - v1.21.1

    This release contains fixes and improvements.

    Improvements:

    • UO-136 DOCX to PDF Conversion paragraph line spacing improvements
    • UO-151 DOCX to PDF anchor drawing for attribute allowOverlap

    Bug Fixes:

    • UO-148 XLSX SharedString.GetString() panic error
    • UO-150 Fix error message customer name being swapped with expected.
    • UO-152 Use a specific go-ole version to prevent x/sys from being upgraded to a version that is not compatible with Go 1.17 and below
  • v1.21.0(Sep 2, 2022)

    Release notes - UniOffice - Version 1.21.0

    This release contains new features, fixes and improvements.

    Improvements

    • Github Issue 427 DOCX to PDF conversion font name lookup improvement.
    • CI workflows Go version updates.

    Fixes

    • UO-142 Fix small typo in README.

    New features

    • Metered API Key persistent cache and non persistent cache.
    • USD-193 Table row can't split.

    Added examples

    • Example for row can't split.
  • v1.20.0(Jun 27, 2022)

    Release notes - UniOffice - Version v1.20.0

    This release contains multiple improvements in DOCX to PDF conversions.

    Improvements

    • DOCX to PDF conversion is not supporting (a), (b) lists properly.[UO-130]
    • DOCX to PDF conversion showing repeated text whereas it is not in Word output. [UO-131]
    • DOCX to PDF conversion not including page number: 13.docx (header and footer). [UO-132]
    • DOCX to PDF conversion switches to a different font all of a sudden. [UO-133]
    • DOCX to PDF conversion: text inaccurately underlined. [UO-134]
    • DOCX to PDF conversion: Paragraph text goes to a new page when the remaining space couldn't fit the paragraph texts. [UO-135]
  • v1.19.0(Apr 9, 2022)

    Release notes - UniOffice - Version v1.19.0

    This release contains improvements and added examples.

    Improvements

    • Word to PDF conversions: Support font script lookup from settings [UO-129]. This addresses a problem that was reported here https://github.com/unidoc/unioffice/issues/462

    • Add SlideSize and SetSize for presentation slide size [UO-127] This enables setting slide size for presentations other than the default.

    Added examples

    • Numbering usage with nested numbering and customize its indent e.g with roman, alphabet, bullet. [UO-128].
      • https://github.com/unidoc/unioffice-examples/blob/master/document/bullet-and-numbering/main.go
    • Setting presentation slide size
      • https://github.com/unidoc/unioffice-examples/blob/master/presentation/slide-size/main.go
  • v1.18.0(Feb 25, 2022)

    This release is focused on continued improvements in DOCX to PDF conversion quality.

    Improvements

    Improved docx to pdf conversion better accounting for tables and tab stops [USD-180] Better results more closely resembling MS Word can now be obtained for a wider range of files.

    Changes

    Notable changes include additional fields in convert.Options

    // EnableFontSubsetting enables font subsetting during conversion to reduce the size of the result.
    // Default value is `true`.
    EnableFontSubsetting bool
    
    // FontFiles lists the font files to use during conversion.
    FontFiles []string
    
    // FontDirectory is a directory of fonts to use during conversion.
    // If set, all font files inside the directory are loaded;
    // we recommend using FontFiles for better performance.
    FontDirectory string
    
  • v1.17.1(Feb 5, 2022)

    This release is focused on improvements in DOCX to PDF conversion quality.

    Improvements

    • Improved docx to pdf conversion better accounting for default style and font inheritance [USD-178] Better results more closely resembling MS Word can now be obtained.
  • v1.17.0(Jan 15, 2022)

    UniOffice v1.17.0 includes multiple new features and a few significant improvements. ActiveX form field support has been added, line breaks can now be added easily, and Nodes can be used to easily search and replace text by pattern, enabling powerful redaction. PDF conversion for docx files also keeps being improved.

    New features:

    • Paragraph border support added. Enables an easy way to add line-breaks, similar to in MS Word, where one can type in "~~~", "***", "---" followed by newline/Enter to insert a horizontal rule. [UO-117] Example: https://github.com/unidoc/unioffice-examples/tree/master/document/paragraph-borders/main.go shows a few different cases for paragraph borders including line breaks.

    • Support for ActiveX form fields. Support added for reading and editing MS Word ActiveX fields. [UO-65] Example: https://github.com/unidoc/unioffice-examples/tree/master/document/form-activex/main.go demonstrates how to get and change values from ActiveX forms.

    • Text redaction with Nodes to find and replace text. Adds easy to use functions for search and replace of text supporting text and regular expressions. [UO-108] Example: https://github.com/unidoc/unioffice-examples/tree/master/document/node-find-and-replace/main.go shows how to use nodes to easily replace text in a docx file.

    Improvements:

    We added more test cases and fixed cases where the PDF conversion was faulty.

    • Account for style inheritance in PDF conversion. [UO-114]

    • docx to PDF crashes: nil error [UO-119]

    • PDF conversion improvements for fields. Conversion of fields has been improved, and an option to process fields has been added that is off by default. If enabled, it is similar to how LibreOffice displays hidden fields. [UO-109] Examples: https://github.com/unidoc/unioffice-examples/tree/master/document/convert_to_pdf/main.go https://github.com/unidoc/unioffice-examples/tree/master/document/convert_to_pdf_options/main.go The difference in output can be seen by processing the merge_fields conversion case with and without the ProcessFields option, which causes fields to be processed and displayed if hidden (LibreOffice feature). The default behavior is consistent with MS Word.

    • Idempotent style creation when the style already exists. The document AddStyle function now returns the existing style if the styleID already exists. [UO-116]

    NOTE: There is technically one breaking change in this release that we believe should not affect any users. The common/tempstorage tempstorage.File interface now has added io.ReaderAt requirement. This was needed to add the ActiveX support which is based on binary file processing. The provided implementations memstore and diskstore have been updated. We do not believe that there is any other potential use of the interface, so the change was accepted in this minor version.

  • v1.16.0(Nov 10, 2021)

    UniOffice v1.16.0 includes new features and improvements. Node support now enables working with docx document contents in a generic fashion and makes it easy to find and copy content across documents. In addition, significant improvements have been made in PDF conversion quality. Log level support also reduces noise in outputs on standard output by default, but enables getting more detailed debug logs as needed.

    Changes:

    • Node support enables working generically with documents to find and copy contents across docx files. [UO-98].
    • Improved logging with multiple log levels. Added common/logger package [UO-96]
    • PPTX to PDF improvement: List support [UO-107]
    • DOCX to PDF improvement: Retaining field data in conversions [UO-109]

    New examples:

    • Node: Combining selected docx document contents from multiple files into one. https://github.com/unidoc/unioffice-examples/tree/master/document/node-combine
    • Node: Extracting selected docx document contents https://github.com/unidoc/unioffice-examples/tree/master/document/node-extraction
    • Node: Identifying and selecting specific document contents and saving to file https://github.com/unidoc/unioffice-examples/tree/master/document/node-selection
  • v1.15.0(Sep 22, 2021)

    UniOffice v1.15.0 introduces support for converting PowerPoint PPTX presentations to PDF files. In addition, a few other enhancements have been made.

    Changes:

    • [UO-101] Sequence numbers are not part of extracted text
    • [UO-95] Crash when converting Word to PDF
    • [UO-106] Improve Word to PDF paragraph spacing and such
    • [UO-104] Convert pptx to pdf

    New examples for UniOffice v1.15.0:

    • Powerpoint PPTX to PDF conversion examples: https://github.com/unidoc/unioffice-examples/tree/master/presentation/convert_to_pdf

    • Text extraction with numbering for Word docx documents https://github.com/unidoc/unioffice-examples/tree/master/document/text_extraction_with_numbering/main.go

  • v1.14.0(Aug 23, 2021)

    UniOffice v1.14.0 includes the following changes:

    • Convert Excel spreadsheets (XLSX) to PDF [UO-94]
    • Fix duplicate mc:Ignorable attribute [UO-93]
    • Mail merge header image issue [UO-100]
    • Several images error fix [UO-103]

    New examples for UniOffice v1.14.0:

    • Example for XLSX to PDF conversion https://github.com/unidoc/unioffice-examples/tree/master/spreadsheet/convert_to_pdf
    • Licensing updated to use unicloud metered license by default in examples and providing another example for offline license key usage https://github.com/unidoc/unioffice-examples/tree/master/license
  • v1.13.0(Jul 30, 2021)

    This minor version release includes the following changes:

    • Word docx watermark support. Text and image based watermarks. [UO-62]
    • Powerpoint extraction fixes [UO-92]
    • Word docx to pdf conversion improvements - Support composite fonts, including chinese, japanese, korean symbolic font files. [UO-91]

    New examples for UniOffice v1.13.0

    • DOCX to PDF conversion with custom composite symbolic fonts (chinese, japanese, korean for instance). https://github.com/unidoc/unioffice-examples/tree/master/document/doc-to-pdf-fonts
    • DOCX to PDF example updates https://github.com/unidoc/unioffice-examples/blob/master/document/doc-to-pdf
    • Adding an image-based watermark to a docx file https://github.com/unidoc/unioffice-examples/tree/master/document/watermark-picture
    • Adding a text-based watermark to a docx file https://github.com/unidoc/unioffice-examples/tree/master/document/watermark-text
  • v1.12.0(Jun 16, 2021)

    This minor version release includes the following changes:

    • Multiple PDF conversion fixes: Chart handling, indentation and font styles, hyperlinks

    • Add image wrapping options (document package) [UO-80]

    • Paragraph indent and line spacing with example page size and orientation (document package) [UO-83]

    • Extract hyperlink in r.Text() [UO-90] Resolves #268

    New examples

  • v1.11.0(May 31, 2021)

    This minor version adds

    • Add capability to set Cell Protection in spreadsheets [UO-88] Added example: https://github.com/unidoc/unioffice-examples/tree/master/spreadsheet/cell-protection
    • Minor schema fixes
  • v1.10.0(Apr 24, 2021)

  • v1.9.0(Mar 16, 2021)

    This minor release adds:

    • Metered license support
    • Add Any fields for CT_Background, CT_Object, CT_Picture [UO-78]
    • Add activeX control type to parse activeX fields. [UO-73]
  • v1.8.0(Jan 6, 2021)

    The v1.8.0 minor version release of UniOffice includes the following new features:

    • Text extraction for document, spreadsheet, presentation packages Both vectorized (objects) and plain text. New examples:
      • Word document docx text extraction: https://github.com/unidoc/unioffice-examples/blob/master/document/text_extraction/main.go
      • Excel spreadsheet text extraction: https://github.com/unidoc/unioffice-examples/blob/master/spreadsheet/text_extraction/main.go
      • Powerpoint presentation text extraction: https://github.com/unidoc/unioffice-examples/blob/master/presentation/text_extraction/main.go
    • Support for AlternateContent (Any) in runs and paragraphs (textbox support)
      • New example: https://github.com/unidoc/unioffice-examples/blob/master/document/run-properties/main.go As well as used in textbox extraction for paragraphs in the new text extraction support.
  • v1.7.1(Dec 17, 2020)

    This patch version release 1.7.1 of UniOffice contains the following fixes:

    • Formula parsing and evaluation fixes [UO-49]: a round of fuzzing and fixing issues. Added compilation and evaluation timeouts, along with multiple fixes in various formula handlers and input and bounds checking to avoid crashes.
    • Fixes for images in spreadsheets [UO-71, UO-79, USD-47]: addresses schema issues that came up when images were added to spreadsheets.
  • v1.7.0(Nov 9, 2020)

    This minor version release 1.7.0 of UniOffice contains the following fixes and enhancements:

    Fixes and enhancements

    • Support for merging docx documents [UO-19]. Document now has an Append function to append another document to it. Example for merging documents: https://github.com/unidoc/unioffice-examples/blob/master/document/merge-documents/main.go

    • Improved document header creation and modification support [UO-63]. Fixes https://github.com/unidoc/unioffice/issues/405
      • New example for creating and updating a header: https://github.com/unidoc/unioffice-examples/tree/master/document/doc-existing-header
      • New example for creating headers on even/odd pages: https://github.com/unidoc/unioffice-examples/tree/master/document/even-odd-header

    • Improved schema support for union functions [UO-54]. Fixes https://github.com/unidoc/unioffice/issues/243

    • Fix to make document.RemoveParagraph also remove paragraphs from table cells [UO-53]. Fixes https://github.com/unidoc/unioffice/issues/412

  • v1.6.0(Oct 13, 2020)

    This minor version release 1.6.0 of UniOffice contains the following fixes and enhancements:

    Fixes and enhancements

  • v1.5.1(Aug 31, 2020)

  • v1.5.0(Aug 23, 2020)

    Version 1.5.0 has multiple enhancements and new features.

    Highlights

    New features include:

    Other:

    • Close function added in document, spreadsheet, presentation to properly clean up resources. All examples updated to include this.

    As well as various bugfixes.

    Note that UniOffice is a commercial library and requires an electronic license code to operate. A free trial can be obtained at https://unidoc.io/

  • v1.4.0(Jun 30, 2020)

    Version 1.4.0 has multiple fixes and improvements.

    Highlights

    • Improved support for PowerPoint presentations and for creating pptx files from potx templates.
    • New example for creating powerpoint pptx presentations from templates with images: https://github.com/unidoc/unioffice/blob/master/_examples/presentation/use-template-with-image/main.go
    • Improved automatic build to run automatic validation of all input and output files to improve quality control and catch regressions.
    • Fixes in strict conformance document output and ability to control conformance level. Includes an example: https://github.com/unidoc/unioffice/blob/master/_examples/document/set-strict/main.go

    Pull requests

    • #414 New data and package names in schema (#414) (@zgordan-vv)
    • #413 dml struct names fix (#413) (@zgordan-vv)
    • #409 Presentation tests (#409) (@zgordan-vv)
    • #407 Resolve UO-27, UO-28, UO-31 (#407) (@zgordan-vv)
    • #406 Fixes in CI (#406) (@gunnsth)
    • #404 Validating examples (#404) (@zgordan-vv)
    • #395 PPT relationships, image fix (#395) (@zgordan-vv)
    • #400 SaveAsTemplate and SaveToFileAsTemplate are added for presentation (#400) (@zgordan-vv)
    • #398 error with incorrect content types names fixed when deleting slides a… (#398) (@zgordan-vv)
    • #399 document.SetConformance is added with an example (#399) (@zgordan-vv)
    • #397 Add nil checks to prevent crash UO-26 (#397) (@gunnsth)
  • v1.3.1(May 18, 2020)

    This version contains bug fixes and a couple of new features.

    New features

    • High level functions for getting paragraph and run properties and style information
    • New playground example for extracting paragraph and run properties from a table: https://play.unidoc.io/p/9f1ed9d356940989
    • New example for adding paragraphs before and after tables: https://github.com/unidoc/unioffice/blob/master/_examples/document/paragraphs_in_table/main.go
    • High level functions for getting endnotes and footnotes from documents
    • New example showcasing support for getting footnotes and endnotes: https://github.com/unidoc/unioffice/blob/master/_examples/document/endnotes_footnotes/main.go

    Pull requests merged

    • #392 paragraph and run properties (#392) (@zgordan-vv)
    • #387 Issue #385 fix (#387) (@zgordan-vv)
    • #384 numbering fix (#384) (@zgordan-vv)
    • #380 Fixes for being able to compile with playground (#380) (@zgordan-vv)
    • #377 Get all cells in a row with empty ones (#377) (@zgordan-vv)
    • #374 add the ability to utilize footnotes and endnotes in documents (#374) (@compleatang)
  • v1.3.0(Feb 18, 2020)

    Highlights

    • Significantly enhanced support for Excel formula functions in spreadsheets
    • Enhanced formula parser and more functions implemented
    • Capability to remove columns with automatic reference updates
    • New example to flatten spreadsheets
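
To illustrate what "automatic reference updates" means when a column is removed, here is a minimal standalone sketch using only the Go standard library. It is not the library's implementation: the helper names colToIndex, indexToCol and shiftAfterRemoval are hypothetical and exist only for this illustration. The idea is that any reference to the right of the removed column moves one column to the left, while a reference to the removed column itself becomes invalid.

```go
package main

import (
	"fmt"
	"strings"
)

// colToIndex converts a column name such as "A" or "AB" to a 1-based index.
// Hypothetical helper; assumes the input contains only ASCII letters.
func colToIndex(col string) int {
	n := 0
	for _, r := range strings.ToUpper(col) {
		n = n*26 + int(r-'A') + 1
	}
	return n
}

// indexToCol converts a 1-based column index back to its letter name.
func indexToCol(n int) string {
	name := ""
	for n > 0 {
		n-- // shift to 0-based so that 26 maps to "Z" rather than "A@"
		name = string(rune('A'+n%26)) + name
		n /= 26
	}
	return name
}

// shiftAfterRemoval returns the column a reference should point at after
// the column `removed` has been deleted. ok is false when the reference
// pointed at the removed column itself.
func shiftAfterRemoval(ref, removed string) (col string, ok bool) {
	r, d := colToIndex(ref), colToIndex(removed)
	switch {
	case r == d:
		return "", false
	case r > d:
		return indexToCol(r - 1), true
	default:
		return ref, true
	}
}

func main() {
	// Remove column B and see where existing references end up.
	for _, ref := range []string{"A", "B", "C", "AA"} {
		if col, ok := shiftAfterRemoval(ref, "B"); ok {
			fmt.Printf("%s -> %s\n", ref, col)
		} else {
			fmt.Printf("%s -> invalid reference\n", ref)
		}
	}
}
```

A real spreadsheet also has to rewrite references inside formulas, ranges and merged cells; the sketch covers only the single-column arithmetic.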

    Pull requests involved:

    • #371 Spreadsheet: Remove columns feature (Issue #367) (#371) (@zgordan-vv)
    • #369 copying cell formats when flattening (#369) (@zgordan-vv)
    • #368 Flatten fixes (#368) (@zgordan-vv)
    • #366 Spreadsheet formulas: Flattening files (#366) (@zgordan-vv)
    • #363 Test cases for functions for https://github.com/unidoc/unioffice/issues/336 (#363) (@zgordan-vv)
    • #362 Financial functions: part 4 (#362) (@zgordan-vv)
    • #361 Financial functions: part 3 (#361) (@zgordan-vv)
    • #360 Financial functions: part 2 (#360) (@zgordan-vv)
    • #359 Financial functions - part 1 (#359) (@zgordan-vv)
    • #357 Test cases for SUM and IF (#357) (@zgordan-vv)
    • #354 TEXT (#354) (@zgordan-vv)
    • #353 Excel functions part 4 (#353) (@zgordan-vv)
    • #351 Excel spreadsheet functions, part 3 (#351) (@zgordan-vv)
    • #348 Functions2 (#348) (@zgordan-vv)
    • #345 Spreadsheed Formula Functions (#345) (@zgordan-vv)
  • v1.2.1(Oct 23, 2019)

    This version contains bug fixes and a few new minor features.

    Fixes and enhancements.

    • #340 Optional custom.xml (#340) (@zgordan-vv)
    • #335 watermarks for spreadsheets and presentations (#335) (@zgordan-vv)
    • #330 #315, #323, #329 (#330) (@zgordan-vv)
    • #308 Refactor header/footer reusing document.tables (#308) (@5andr0)
    • #309 Support custom metadata in document properties (#309) (@zgordan-vv)
    • #311 issue with paragraph containing several form fields #305 fix (#311) (@zgordan-vv)
    • #306 add hh time format support (#306) (@lunny)
    • #295 Add images from bytes for presentation and workbook (#295) (@mec07)
    • #303 Add tables loop to Paragraphs funcs of Header and Footer (#303) (@nkryuchkov)
    • #291 Fix runtime panic when making a presentation from a template (#291) (@mec07)
    • #299 Fix comment on the AddSheet (#299) (@nkryuchkov)
  • v1.2.0(Jun 13, 2019)

  • v1.1.0(May 27, 2019)

    Package renamed to unioffice and import path updated to github.com/unidoc/unioffice

    Feature Additions

    • Functions to copy and remove sheets (#281)
    • Add tables loop to document's Paragraphs func (#280)
    • Add purl.oclc.org namespace support (#265)
    • include CT_SdtRow entries in table.Rows() (#261)
    • include CT_SdtCell entries in row.Cells() (#260)
    • Included nested tables in document.Tables() output (#257)
    • Support fetching bookmarks within tables, including recursively (#255)
    • Image from data (#251)
    • Added a RemoveCalcChain function to remove the un-needed cached calculation chain (#215)
    • Added support for nested tables in documents (#221)
    • Document Set/Get MultiLevelType (#222)
    • NumberingLevel now starts numbering at 1 by default, not 0. (#222)
    • Support SetAlignment(), SetStartIndent() and SetHangingIndent() in ParagraphStyleProperties (#222)
    • Add support for Run page breaks (#222)
    • Support cell rotation in spreadsheets (#226)
    • Added color RGBA constructor (#235)

    Bug Fixes

    • Look for mail merge fields inside tables (#223)
    • Specify custom row height attribute to allow setting row heights (#232)
  • v1.0.1(Oct 14, 2018)

  • v1.0.0(Sep 28, 2018)

    As the code is now in production use by several commercial customers, we're tagging a v1.0.0 release. We will be following standard Semantic Versioning from here on out for everything outside of the schema directory, but don't expect any major changes even in there. Thanks to everyone who has supported gooxml in its first year!

  • v0.9.2(Sep 28, 2018)

    What's New

    • document: Support for controlling table cell margins #202
    • spreadsheet: Support inserting rows within a sheet #203

    Bug Fixes

    • Fix bug in IsBold for runs with a bold property set to false #204
Owner
UniDoc
PDF and Office (docx, xlsx, pptx) libraries for Golang