10 changes: 5 additions & 5 deletions Gemfile.lock
@@ -3,15 +3,15 @@ PATH
specs:
sablon (0.0.21)
nokogiri (>= 1.6.0)
rubyzip (>= 1.1)
rubyzip (>= 1.1.1)

GEM
remote: https://rubygems.org/
specs:
mini_portile2 (2.1.0)
mini_portile2 (2.2.0)
minitest (5.8.0)
nokogiri (1.7.1)
mini_portile2 (~> 2.1.0)
nokogiri (1.8.0)
mini_portile2 (~> 2.2.0)
rake (10.4.2)
rubyzip (1.2.1)
xml-simple (1.1.5)
@@ -27,4 +27,4 @@ DEPENDENCIES
xml-simple

BUNDLED WITH
1.14.5
1.14.6
131 changes: 120 additions & 11 deletions README.md
@@ -8,6 +8,28 @@ and efficient.

*Note: Sablon is still in early development. Please report if you encounter any issues along the way.*

#### Table of Contents
* [Installation](#installation)
* [Usage](#usage)
* [Writing Templates](#writing-templates)
* [Content Insertion](#content-insertion)
* [WordProcessingML](#wordprocessingml)
* [HTML](#html)
* [Conditionals](#conditionals)
* [Loops](#loops)
* [Nesting](#nesting)
* [Comments](#comments)
* [Configuration (Beta)](#configuration-beta)
* [Customizing HTML Tag Conversion](#customizing-html-tag-conversion)
* [Customizing CSS Style Conversion](#customizing-css-style-conversion)
* [Executable](#executable)
* [Examples](#examples)
* [Using a Ruby script](#using-a-ruby-script)
* [Using the sablon executable](#using-the-sablon-executable)
* [Contributing](#contributing)
* [Inspiration](#inspiration)


## Installation

Add this line to your application's Gemfile:
@@ -126,14 +148,13 @@ For a complete example see the test file located on `test/image_test.rb`.

This functionality was inspired by the [kubido fork](https://github.com/kubido/sablon) of this project - kubido/sablon

##### HTML [experimental]
##### HTML

Similar to WordProcessingML it's possible to use html as input while processing the
tempalte. You don't need to modify your templates, a simple insertion operation
Similar to WordProcessingML, it's possible to use HTML as input while processing the template. You don't need to modify your templates; a simple insertion operation
is sufficient:

```
«=article.body»
«=article»
```

To use HTML insertion prepare the context like so:
@@ -142,24 +163,40 @@ To use HTML insertion prepare the context like so:
html_body = <<-HTML
<div>This text can contain <em>additional formatting</em>
according to the <strong>HTML</strong> specification.</div>
<p style="text-align: right; background-color: #FFFF00">Right aligned
content with a yellow background color</p>
<div><span style="color: #123456">Inline styles</span> are possible as well</div>
HTML
context = {
article: { html_body: Sablon.content(:html, html_body) }
article: Sablon.content(:html, html_body)
# alternative method using special key format
# 'html:article' => html_body
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context
```

Currently HTML insertion is very limited and strongly focused on the HTML
generated by [Trix editor](https://github.com/basecamp/trix).
Currently, HTML insertion is somewhat limited. Avoid nesting block-level tags such as `p` and `div` within each other; otherwise the final document may not generate as anticipated. List tags (`ul` and `ol`) and inline tags (`span`, `b`, `em`, etc.) can be nested as deeply as needed.

IMPORTANT: This feature is very much *experimental*. Currently, the insertion
will replace the containing paragraph. This means that other content in the same
paragraph is discarded.
Not all tags are supported. Currently supported tags are defined in [configuration.rb](lib/sablon/configuration/configuration.rb) for paragraphs in method `prepare_paragraph` and for text runs in `prepare_run`.

Basic conversion of CSS inline styles into matching WordML properties is supported through the `style=" ... "` attribute in the HTML markup. Not all possible styles are supported, and only a small subset of CSS styles have a direct WordML equivalent. Styles are passed on to nested elements. The currently supported styles are also defined in [configuration.rb](lib/sablon/configuration/configuration.rb) in method `process_style`. Simple toggle properties that aren't directly supported can be added using the `text-decoration: ` style attribute with the proper WordML tag name as the value. Paragraph and Run property references can be found at:
* http://officeopenxml.com/WPparagraphProperties.php
* http://officeopenxml.com/WPtextFormatting.php
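
As a rough illustration of the `text-decoration` behaviour described above, the sketch below copies the `text-decoration` lambda from `configuration.rb` into a stand-alone constant and calls it directly. The constant name `TEXT_DECORATION` is hypothetical and exists only for this example; inside the gem the lambda is looked up through the configuration object rather than called like this.

```ruby
# Stand-alone copy of the 'text-decoration' converter from configuration.rb.
# TEXT_DECORATION is a hypothetical name used only in this sketch.
TEXT_DECORATION = lambda { |v|
  supported = %w[line-through underline]
  props = v.split
  # any unrecognised first word is passed through as a toggle property
  return props[0], 'true' unless supported.include? props[0]
  return 'strike', 'true' if props[0] == 'line-through'
  return 'u', 'single' if props.length == 1
  return 'u', { val: props[1], color: 'auto' } if props.length == 2
  return 'u', { val: props[1], color: props[2].delete('#') }
}

TEXT_DECORATION.call('underline')              # => ['u', 'single']
TEXT_DECORATION.call('line-through')           # => ['strike', 'true']
TEXT_DECORATION.call('underline double')       # => ['u', { val: 'double', color: 'auto' }]
TEXT_DECORATION.call('underline wave #FF0000') # => ['u', { val: 'wave', color: 'FF0000' }]
# 'caps' is not in the supported list, so it becomes a simple toggle property
TEXT_DECORATION.call('caps')                   # => ['caps', 'true']
```

The last call is the "simple toggle" case: the first word of the value is used as the WordML property name with the value `'true'`.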

If you wish to write out your HTML code in an indented, human-readable fashion, or you are pulling content from the ERB templating engine in Rails, the following regular expressions can help eliminate extraneous whitespace in the final document.
```ruby
# combine all white space
html_str = html_str.gsub(/\s+/, ' ')
# clear any white space between block level tags and other content
html_str = html_str.gsub(%r{\s*<(/?(?:h\d|div|p|br|ul|ol|li).*?)>\s*}, '<\1>')
```
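
To make the effect of the two substitutions concrete, here is a small worked example on an indented heredoc. The helper name `compact_html` is hypothetical and not part of Sablon; it only wraps the two `gsub` calls shown above.

```ruby
# Hypothetical helper (not part of Sablon) wrapping the two substitutions.
def compact_html(html_str)
  # collapse every run of whitespace into a single space
  html_str = html_str.gsub(/\s+/, ' ')
  # drop whitespace on either side of block level tags
  html_str.gsub(%r{\s*<(/?(?:h\d|div|p|br|ul|ol|li).*?)>\s*}, '<\1>')
end

indented = <<-HTML
  <div>
    First   paragraph
  </div>
  <p>Second <em>paragraph</em></p>
HTML

puts compact_html(indented)
# prints: <div>First paragraph</div><p>Second <em>paragraph</em></p>
```

Whitespace next to inline tags such as `em` is left alone, so the space between "Second" and the emphasised word survives.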

IMPORTANT: Currently, the insertion will replace the containing paragraph. This means that other content in the same paragraph is discarded.


#### Conditionals

Sablon can render parts of the template conditonally based on the value of a
Sablon can render parts of the template conditionally based on the value of a
context variable. Conditional fields are inserted around the content.

```
@@ -213,6 +250,78 @@ styles for HTML insertion.
«endComment»
```

### Configuration (Beta)

The Sablon::Configuration singleton is a new feature that allows the end user to customize HTML parsing to their needs without needing to fork and edit the source code of the gem. This API is still in a beta state and may be subject to change as future needs are identified beyond HTML conversion.

The example below shows how to expose the configuration instance:
```ruby
Sablon.configure do |config|
# manipulate config object
end
```

The default set of registered HTML tags and CSS property conversions are defined in [configuration.rb](lib/sablon/configuration/configuration.rb).
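
The pattern behind `Sablon.configure` (a singleton object yielded to a block) can be sketched without the gem. Everything named `Demo*` below, along with `register_tag`, is a hypothetical stand-in written for this illustration; it mirrors the shape of `Sablon.configure` in `lib/sablon.rb` but is not Sablon's code.

```ruby
require 'singleton'

# Hypothetical stand-in for Sablon::Configuration, for illustration only.
class DemoConfiguration
  include Singleton

  attr_reader :tags

  def initialize
    @tags = {}
  end

  # Simplified registry: stores the options hash under the tag name.
  def register_tag(name, type = :inline, **options)
    @tags[name.to_sym] = { type: type }.merge(options)
  end
end

module Demo
  # Same shape as Sablon.configure: yield the singleton if a block is given.
  def self.configure
    yield(DemoConfiguration.instance) if block_given?
  end
end

Demo.configure do |config|
  config.register_tag(:bgcyan, :inline, properties: { highlight: 'cyan' })
end

# A later configure call sees the same instance, so the tag is still registered.
Demo.configure { |config| puts config.tags.key?(:bgcyan) }
# prints: true
```

Because the configuration is a singleton, changes made in one `configure` block persist for the lifetime of the process, which is worth keeping in mind when customizing it inside a test suite.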

#### Customizing HTML Tag Conversion

Any HTML tag can be added using the configuration object, even if it needs a custom AST class to handle conversion logic. Simple inline tags that only modify the style of text (e.g. the already supported `<b>` tag) can be added without an AST class as shown below:
```ruby
Sablon.configure do |config|
config.register_html_tag(:bgcyan, :inline, properties: { highlight: 'cyan' })
end
```
The above tag simply adds a background color to text using the `<w:highlight w:val="cyan" />` property.


More complex business logic can be supported by adding a new class under the `Sablon::HTMLConverter` namespace. The new class will likely subclass `Sablon::HTMLConverter::Node` or `Sablon::HTMLConverter::Collection` depending on the needed behavior. The current AST classes serve as additional examples and can be found in [ast.rb](/lib/sablon/html/ast.rb). When registering a new HTML tag that uses a custom AST class the class must be passed in either by name using a lowercased and underscored symbol or the class object itself.

The block below shows how to register a new HTML tag that adds the following AST class: `Sablon::HTMLConverter::InstrText`.
```ruby
module Sablon
class HTMLConverter
class InstrText < Node
# implementation details ...
end
end
end
# register tag
Sablon.configure do |config|
config.register_html_tag(:bgcyan, :inline, ast_class: :instr_text)
end
```

Existing tags can be overwritten using the `config.register_html_tag` method or removed entirely using `config.remove_html_tag`.
```ruby
# remove tag
Sablon.configure do |config|
# remove support for the span tag
config.remove_html_tag(:span)
end
```


#### Customizing CSS Style Conversion

The conversion of CSS stored in an element's `style="..."` attribute can be customized using the configuration object as well. Adding a new style conversion or overriding an existing one is done using the `config.register_style_converter` method. It accepts three arguments: the name of the AST node the style applies to (as a lowercased and underscored symbol), the name of the CSS property (needs to be a string in most cases), and a lambda that accepts a single argument, the property value. The example below shows how to add a new style that sets the `<w:highlight />` property.
```ruby
# add style conversion
Sablon.configure do |config|
# register new conversion for the Sablon::HTMLConverter::Run AST class.
converter = lambda { |v| return 'highlight', v }
config.register_style_converter(:run, 'custom-highlight', converter)
end
```
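
To show what a converter lambda returns, the sketch below applies a few converters to a raw `style` string outside of Sablon. The `color`, `font-size` and `custom-highlight` lambdas follow the ones in this document; the `convert_style` helper and the constant names are hypothetical and only approximate how a style string could be turned into WordML properties.

```ruby
# Stand-alone copy of the :_sz helper: CSS size -> Word half-points, as a string.
SIZE_IN_HALF_POINTS = lambda { |v|
  return nil unless v
  (2 * Float(v.gsub(/[^\d.]/, '')).ceil).to_s
}

# Each converter takes the CSS value and returns [wordml_property, value].
CONVERTERS = {
  'color' => ->(v) { return 'color', v.delete('#') },
  'font-size' => ->(v) { return 'sz', SIZE_IN_HALF_POINTS.call(v) },
  'custom-highlight' => ->(v) { return 'highlight', v }
}.freeze

# Hypothetical helper (not Sablon's implementation): split a style attribute
# into declarations and run each one through its converter, if any.
def convert_style(style, converters)
  style.split(';').each_with_object({}) do |declaration, props|
    name, value = declaration.split(':', 2).map(&:strip)
    converter = converters[name]
    next unless converter && value
    key, val = converter.call(value)
    props[key] = val if key
  end
end

props = convert_style('color: #123456; font-size: 10.5pt; custom-highlight: cyan',
                      CONVERTERS)
puts props['color']     # prints: 123456
puts props['sz']        # prints: 22
puts props['highlight'] # prints: cyan
```

Note that `return 'highlight', v` inside a lambda returns a two-element array, which is why the helper can destructure it into a property name and a value.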

Existing conversions can be overwritten using the `config.register_style_converter` method or removed entirely using `config.remove_style_converter`.
```ruby
# remove style conversion
Sablon.configure do |config|
# remove support for conversion of font-size for the Run AST class
config.remove_style_converter(:run, 'font-size')
end
```

### Executable

The `sablon` executable can be used to process templates on the command-line.
7 changes: 7 additions & 0 deletions lib/sablon.rb
@@ -1,7 +1,10 @@
require 'zip'
require 'nokogiri'
require 'open-uri'

require "sablon/version"
require "sablon/configuration/configuration"

require "sablon/numbering"
require "sablon/images"
require "sablon/context"
@@ -21,6 +24,10 @@ module Sablon
class TemplateError < ArgumentError; end
class ContextError < ArgumentError; end

def self.configure
yield(Configuration.instance) if block_given?
end

def self.template(path)
Template.new(path)
end
165 changes: 165 additions & 0 deletions lib/sablon/configuration/configuration.rb
@@ -0,0 +1,165 @@
require 'singleton'
require 'sablon/configuration/html_tag'

module Sablon
# Handles storing configuration data for the sablon module
class Configuration
include Singleton

attr_accessor :permitted_html_tags, :defined_style_conversions

def initialize
initialize_html_tags
initialize_css_style_conversion
end

# Adds a new tag to the permitted tags hash or replaces an existing one
def register_html_tag(tag_name, type = :inline, **options)
tag = HTMLTag.new(tag_name, type, **options)
@permitted_html_tags[tag.name] = tag
end

# Removes a tag from the permitted tags hash, returning it
def remove_html_tag(tag_name)
@permitted_html_tags.delete(tag_name)
end

# Adds a new style property converter for the specified AST class and
# CSS property name. The ast_node argument should be the class name
# in lowercased snake case as a symbol, i.e. MyClass -> :my_class.
# The converter passed in must be a proc or lambda that accepts
# a single argument (the value) and returns two values: the WordML property
# name and its value. The converted property value can be a string, hash
# or array.
def register_style_converter(ast_node, prop_name, converter)
# create a new ast node hash if needed
unless @defined_style_conversions[ast_node]
@defined_style_conversions[ast_node] = {}
end
# add the style converter to the node's hash
@defined_style_conversions[ast_node][prop_name] = converter
end

# Deletes a CSS converter from the hash by specifying the AST class
# in lowercased snake case and the property name.
def remove_style_converter(ast_node, prop_name)
@defined_style_conversions[ast_node].delete(prop_name)
end

private

# Defines all of the initial HTML tags to be used by HTMLConverter
def initialize_html_tags
@permitted_html_tags = {}
tags = {
# special tag used for elements with no parent, i.e. top level
'#document-fragment' => { type: :block, ast_class: :root, allowed_children: :_block },

# block level tags
div: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Normal' }, allowed_children: :_inline },
p: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Paragraph' }, allowed_children: :_inline },
h1: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading1' }, allowed_children: :_inline },
h2: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading2' }, allowed_children: :_inline },
h3: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading3' }, allowed_children: :_inline },
h4: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading4' }, allowed_children: :_inline },
h5: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading5' }, allowed_children: :_inline },
h6: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading6' }, allowed_children: :_inline },
ol: { type: :block, ast_class: :list, properties: { pStyle: 'ListNumber' }, allowed_children: %i[ol li] },
ul: { type: :block, ast_class: :list, properties: { pStyle: 'ListBullet' }, allowed_children: %i[ul li] },
li: { type: :block, ast_class: :list_paragraph },

# inline style tags
span: { type: :inline, ast_class: nil, properties: {} },
strong: { type: :inline, ast_class: nil, properties: { b: nil } },
b: { type: :inline, ast_class: nil, properties: { b: nil } },
em: { type: :inline, ast_class: nil, properties: { i: nil } },
i: { type: :inline, ast_class: nil, properties: { i: nil } },
u: { type: :inline, ast_class: nil, properties: { u: 'single' } },
s: { type: :inline, ast_class: nil, properties: { strike: 'true' } },
sub: { type: :inline, ast_class: nil, properties: { vertAlign: 'subscript' } },
sup: { type: :inline, ast_class: nil, properties: { vertAlign: 'superscript' } },

# inline content tags
text: { type: :inline, ast_class: :run, properties: {}, allowed_children: [] },
br: { type: :inline, ast_class: :newline, properties: {}, allowed_children: [] }
}
# add all tags to the config object
tags.each do |tag_name, settings|
type = settings.delete(:type)
register_html_tag(tag_name, type, **settings)
end
end

# Defines an initial set of CSS -> WordML conversion lambdas stored in
# a nested hash structure where the first key is the AST class and the
# second key is the CSS property name, which maps to the conversion lambda
def initialize_css_style_conversion
@defined_style_conversions = {
# styles shared or common logic across all node types go here.
# Special conversion lambdas such as :_border can be
# defined here for reuse across several AST nodes. Care must
# be taken to avoid possible naming conflicts, hence the underscore.
# AST class keys should be stored with their names converted from
# camelcase to lowercased snakecase, i.e. TestCase = test_case
node: {
'background-color' => lambda { |v|
return 'shd', { val: 'clear', fill: v.delete('#') }
},
_border: lambda { |v|
props = { sz: 2, val: 'single', color: '000000' }
vals = v.split
vals[1] = 'single' if vals[1] == 'solid'
#
props[:sz] = @defined_style_conversions[:node][:_sz].call(vals[0])
props[:val] = vals[1] if vals[1]
props[:color] = vals[2].delete('#') if vals[2]
#
return props
},
_sz: lambda { |v|
return nil unless v
(2 * Float(v.gsub(/[^\d.]/, '')).ceil).to_s
},
'text-align' => ->(v) { return 'jc', v }
},
# Styles specific to the Paragraph AST class
paragraph: {
'border' => lambda { |v|
props = @defined_style_conversions[:node][:_border].call(v)
#
return 'pBdr', [
{ top: props }, { bottom: props },
{ left: props }, { right: props }
]
},
'vertical-align' => ->(v) { return 'textAlignment', v }
},
# Styles specific to a run of text
run: {
'color' => ->(v) { return 'color', v.delete('#') },
'font-size' => lambda { |v|
return 'sz', @defined_style_conversions[:node][:_sz].call(v)
},
'font-style' => lambda { |v|
return 'b', nil if v =~ /bold/
return 'i', nil if v =~ /italic/
},
'font-weight' => ->(v) { return 'b', nil if v =~ /bold/ },
'text-decoration' => lambda { |v|
supported = %w[line-through underline]
props = v.split
return props[0], 'true' unless supported.include? props[0]
return 'strike', 'true' if props[0] == 'line-through'
return 'u', 'single' if props.length == 1
return 'u', { val: props[1], color: 'auto' } if props.length == 2
return 'u', { val: props[1], color: props[2].delete('#') }
},
'vertical-align' => lambda { |v|
return 'vertAlign', 'subscript' if v =~ /sub/
return 'vertAlign', 'superscript' if v =~ /super/
}
}
}
end
end
end