The Artima Developer Community

Ruby Buzz Forum
A use of Enumerable#chunk

Eric Hodel

Posts: 660
Nickname: drbrain
Registered: Mar, 2006

Eric Hodel is a long-time Rubyist and co-founder of Seattle.rb.
A use of Enumerable#chunk Posted: Mar 16, 2012 4:34 PM

This post originated from an RSS feed registered with Ruby Buzz by Eric Hodel.
Original Post: A use of Enumerable#chunk
Feed Title: Segment7
Feed URL: http://blog.segment7.net/articles.rss
Feed Description: Posts about and around Ruby, MetaRuby, ruby2c, ZenTest and work at The Robot Co-op.

In Ruby 1.9, Enumerable has a few new methods, including Enumerable#chunk (added in 1.9.2). The #chunk method walks your Enumerable and groups consecutive elements for which a selecting block returns the same value. Unlike Enumerable#partition, which gathers all matching elements into one array and the rest into another, the chunks are returned in order. Here's an example from the documentation:

[3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5].chunk { |n|
  n.even?
}.each { |even, ary|
  p [even, ary]
}
#=> [false, [3, 1]]
#   [true, [4]]
#   [false, [1, 5, 9]]
#   [true, [2, 6]]
#   [false, [5, 3, 5]]
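
To make the contrast with Enumerable#partition concrete, here is a small sketch that runs both methods over the same digits. The variable names are only for illustration.

```ruby
digits = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

# partition makes exactly two arrays and loses where each run started.
evens, odds = digits.partition { |n| n.even? }

# chunk keeps the original order and yields one [key, elements] pair
# per run of consecutive elements that share a block value.
runs = digits.chunk { |n| n.even? }.to_a

p evens      # [4, 2, 6]
p odds       # [3, 1, 1, 5, 9, 5, 3, 5]
p runs.size  # 5
```

Flattening the element arrays from chunk gives back the original sequence, which is what makes it suitable for order-sensitive processing.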

When I first saw this method I thought, "this looks like a useful method… but how?"

I'm working on bringing Markdown support to RDoc and the last remaining base Markdown feature I need to support is a hard break due to two spaces at the end of a line in a paragraph.

For background, RDoc parses various formats into a common syntax tree which can then be transformed for any supported output (such as HTML, colored ANSI text, etc.). In this syntax tree a paragraph can contain one or more strings which are joined at output time into the paragraph you see.

To add hard line breaks, I decided to create a new HardBreak object and inject it into the paragraph where two trailing spaces are encountered in the source document. The formatters can then be updated to insert the appropriate line break character when emitting a paragraph.

Enumerable#chunk comes in because the Markdown parser doesn't join strings as it's parsing (since the grammar rules get re-used); joining is instead performed as a post-processing step. (String joining as a post-processing step also makes the parser cleaner by hiding the ugliness in one spot rather than spreading it across multiple grammar rules.) Before I started inserting HardBreak objects, this was sufficient:

parts = paragraph.parts.join.rstrip
paragraph.parts.replace [parts]

But now I need to join String chunks and include HardBreaks as-is, which is a perfect use of Enumerable#chunk:

parts = paragraph.parts.chunk do |part|
  String === part
end.map do |string, chunk|
  string ? chunk.join.rstrip : chunk
end.flatten

paragraph.parts.replace parts
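
To try the snippet outside of RDoc, the following self-contained sketch wraps it in a method and runs it on sample data. The HardBreak and Paragraph definitions here are hypothetical stand-ins, not RDoc's real classes, and join_string_parts is a name made up for this example.

```ruby
# Hypothetical stand-ins for RDoc's markup classes (assumptions for this sketch).
HardBreak = Class.new
Paragraph = Struct.new(:parts)

# Joins each run of adjacent Strings into one String (trailing whitespace
# stripped) and leaves non-String parts, such as HardBreaks, where they are.
def join_string_parts(paragraph)
  parts = paragraph.parts.chunk do |part|
    String === part
  end.map do |string, chunk|
    string ? chunk.join.rstrip : chunk
  end.flatten

  paragraph.parts.replace parts
  paragraph
end

hard_break = HardBreak.new
paragraph  = Paragraph.new ["first ", "line  ", hard_break, "second ", "line\n"]

join_string_parts paragraph

p paragraph.parts.length  # 3
p paragraph.parts.first   # "first line"
p paragraph.parts.last    # "second line"
```

The two trailing spaces that triggered the hard break are removed by rstrip, and the HardBreak object sits between the two joined strings.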

The 1.8-compatible implementation is much uglier since I have to track whether I'm in a String chunk or not in addition to performing the processing. I'm too embarrassed to post it, but you'll be able to find it in the rdoc source once I commit and push it.
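
For a sense of the bookkeeping involved, here is a rough sketch of how the same joining could be done without Enumerable#chunk. This is not the code from the rdoc source; the method name and the sample data are assumptions made for illustration.

```ruby
# Joins runs of adjacent Strings without Enumerable#chunk.
# An illustrative sketch, not RDoc's actual 1.8-compatible code.
def join_string_parts_without_chunk(parts)
  result = []
  buffer = nil  # nil means "not currently inside a run of Strings"

  parts.each do |part|
    if String === part
      # nil.to_s is "", so this both starts and extends a run.
      buffer = buffer.to_s + part
    else
      result << buffer.rstrip if buffer  # close the pending String run
      buffer = nil
      result << part                     # pass non-Strings through as-is
    end
  end

  result << buffer.rstrip if buffer      # close a run that ends the list
  result
end

break_marker = Object.new  # stands in for a HardBreak
joined = join_string_parts_without_chunk(
  ["first ", "line  ", break_marker, "second ", "line\n"]
)

p joined.length  # 3
p joined.first   # "first line"
p joined.last    # "second line"
```

The explicit buffer variable and the extra flush after the loop are exactly the state tracking that Enumerable#chunk hides.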
