Ruby Buzz Forum - A use of Enumerable#chunk

Eric Hodel

Posts: 660
Nickname: drbrain
Registered: Mar, 2006

Eric Hodel is a long-time Rubyist and co-founder of Seattle.rb.


Posted: Mar 16, 2012 4:34 PM

This post originated from an RSS feed registered with Ruby Buzz by Eric Hodel.
Original Post: A use of Enumerable#chunk
Feed Title: Segment7
Feed URL: http://blog.segment7.net/articles.rss
Feed Description: Posts about and around Ruby, MetaRuby, ruby2c, ZenTest and work at The Robot Co-op.

Ruby 1.9 added a few new methods to Enumerable, including Enumerable#chunk (which was added for 1.9.2). The #chunk method walks your Enumerable and groups consecutive elements for which the block returns the same value. Unlike Enumerable#partition, which sorts every element into one of two arrays regardless of position, the chunks are returned in order. Here's an example from the documentation:

[3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5].chunk { |n|
  n.even?
}.each { |even, ary|
  p [even, ary]
}
#=> [false, [3, 1]]
#   [true, [4]]
#   [false, [1, 5, 9]]
#   [true, [2, 6]]
#   [false, [5, 3, 5]]
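
To make the contrast with Enumerable#partition concrete, here is a small sketch that runs both methods over the same digits. The variable names are only illustrative.

```ruby
digits = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

# partition always returns exactly two arrays: matches first, then non-matches.
# The position of each element relative to the other group is lost.
evens, odds = digits.partition { |n| n.even? }

# chunk returns [block_value, consecutive_elements] pairs in source order,
# starting a new pair every time the block's value changes.
runs = digits.chunk { |n| n.even? }.to_a

p evens        # [4, 2, 6]
p odds         # [3, 1, 1, 5, 9, 5, 3, 5]
p runs.length  # 5
```

With partition you cannot tell that 4 sat between two runs of odd numbers; with chunk that ordering survives.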

When I first saw this method I thought, "this looks like a useful method… but how?"

I'm working on bringing Markdown support to RDoc and the last remaining base Markdown feature I need to support is a hard break due to two spaces at the end of a line in a paragraph.

For background, RDoc parses various formats into a common syntax tree which can then be transformed for any supported output (such as HTML, colored ANSI text, etc.). In this syntax tree a paragraph can contain one or more strings which are joined at output time into the paragraph you see.

To add hard line breaks, I decided to create a new HardBreak object and inject it into the paragraph where two trailing spaces are encountered in the source document. The formatters can then be updated to insert the appropriate line break character when emitting a paragraph.
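
As a rough illustration of that idea (not RDoc's actual parser code), the sketch below turns raw paragraph text into a list of strings and break markers. The HardBreak class here is a hypothetical stand-in, parts_with_hard_breaks is an invented helper name, and Markdown edge cases such as a break on the final line are ignored.

```ruby
# Hypothetical stand-in for the break marker; not RDoc's real class.
class HardBreak
  # Treat all markers as equal so arrays of parts can be compared.
  def ==(other)
    other.is_a?(HardBreak)
  end
end

# Hypothetical helper: split paragraph source into strings, adding a
# HardBreak after every line that ends in two (or more) spaces.
def parts_with_hard_breaks(source)
  parts = []
  source.each_line do |line|
    text = line.chomp
    if text.end_with?('  ')
      parts << text.rstrip << HardBreak.new
    else
      parts << text
    end
  end
  parts
end

parts = parts_with_hard_breaks("first line  \nsecond line\n")
p parts.map { |part| String === part ? part : :hard_break }
# ["first line", :hard_break, "second line"]
```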

Enumerable#chunk comes in because the Markdown parser doesn't join strings as it parses (since the grammar rules get reused); joining is instead performed as a post-processing step. (String joining as a post-processing step also makes the parser cleaner by hiding the ugliness in one spot rather than spreading it across multiple grammar rules.) Before inserting HardBreak objects this was sufficient:

parts = paragraph.parts.join.rstrip
paragraph.parts.replace [parts]

But now I need to join String chunks and include HardBreaks as-is, which is a perfect use of Enumerable#chunk:

parts = paragraph.parts.chunk do |part|
  String === part
end.map do |string, chunk|
  string ? chunk.join.rstrip : chunk
end.flatten

paragraph.parts.replace parts
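
To try the same pattern outside RDoc, here is a self-contained sketch that works on a plain Array instead of a paragraph object. The HardBreak class is a hypothetical stand-in and the sample strings are made up.

```ruby
# Hypothetical stand-in for the break marker; not RDoc's real class.
class HardBreak; end

hard_break = HardBreak.new
parts = ["Hello, ", "world", hard_break, "second ", "line  "]

# Group consecutive parts by "is this a String?", join each run of
# Strings into one stripped String, and pass other runs through as-is.
joined = parts.chunk { |part| String === part }.map do |is_string, run|
  is_string ? run.join.rstrip : run
end.flatten

p joined.map { |part| String === part ? part : :hard_break }
# ["Hello, world", :hard_break, "second line"]
```

The final flatten is needed because non-String runs come back from chunk wrapped in an array.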

The 1.8-compatible implementation is much uglier since I have to track whether I'm in a String chunk or not in addition to performing the processing. I'm too embarrassed to post it, but you'll be able to find it in the rdoc source once I commit and push it.
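
For a sense of what that manual tracking involves, here is one possible way to do it without Enumerable#chunk. This is only a sketch and is not the code from the rdoc source; join_string_runs is an invented name and HardBreak is again a hypothetical stand-in.

```ruby
# Hypothetical stand-in for the break marker; not RDoc's real class.
class HardBreak; end

# Hypothetical helper: join runs of Strings by hand, tracking the
# current run in a buffer that is nil whenever we are outside a run.
def join_string_runs(parts)
  result = []
  buffer = nil

  parts.each do |part|
    if String === part
      buffer = buffer ? buffer + part : part.dup
    else
      # A non-String ends the current run, if there is one.
      result << buffer.rstrip if buffer
      buffer = nil
      result << part
    end
  end

  # Flush a run that reaches the end of the list.
  result << buffer.rstrip if buffer
  result
end

hard_break = HardBreak.new
joined = join_string_runs(["Hello, ", "world", hard_break, "second ", "line  "])
p joined.map { |part| String === part ? part : :hard_break }
# ["Hello, world", :hard_break, "second line"]
```

The state variable and the extra flush after the loop are the bookkeeping that chunk takes care of.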
