#2 new
Hector E. Gomez Morales

Draft of Proposal v0.1

Reported by Hector E. Gomez Morales | March 30th, 2009 @ 08:25 AM | in Proposal Deadline

Tell us a little about yourself.

I am a final-year CS student at UNAM in Mexico City. I was a .NET (Mono) developer before Ruby, and later Rails, won me over.

Why do you use Rails? How would you like to see it improve?

I have been using Rails since the 1.2 release for personal, work and institutional projects. Before Rails I felt very uncomfortable with every other web framework. Then, during a project for my university, the development team lead decided to use Rails. The project was a huge success, and after that I began to use Rails heavily for my web projects. I like the no-nonsense, developer-friendly and fun approach to development in Rails, which at the same time encourages good practices such as TDD, BDD and DRY.

Multi-byte character support in Ruby 1.8 is weak at best, and for a non-English speaker this makes some tasks in Rails quite difficult. Ruby 1.9 greatly improves this support but brings its own set of problems. Any Ruby developer now has to be aware of:

  • The encoding of their source code.
  • The encoding of the environment, which determines the default_external encoding.
  • The encoding of each IO operation.
  • Optionally, a default_internal encoding for automatic transcoding of IO operations.
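As a rough illustration of the four points above, the following Ruby sketch prints the source and default encodings and then reads the same ISO-8859-1 bytes twice: once tagged with an external encoding only, and once with an external:internal pair so they are transcoded on read. The method name `demo_io_encodings` and the sample bytes are made up for this example; the printed defaults depend on the environment it runs in.

```ruby
require "tempfile"

# Writes "café" as ISO-8859-1 bytes to a temporary file and reads it back
# twice. Returns [raw, transcoded]. (Hypothetical helper for illustration.)
def demo_io_encodings
  latin1_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")  # raw binary data

  file = Tempfile.new("latin1")
  file.binmode                 # write the bytes untouched
  file.write(latin1_bytes)
  file.close

  # Item 3: external encoding only. The bytes are tagged, not converted
  # (as long as no default_internal encoding is configured).
  raw = File.open(file.path, "r:ISO-8859-1") { |io| io.read }

  # Item 4: external:internal. The bytes are transcoded to UTF-8 on read.
  transcoded = File.open(file.path, "r:ISO-8859-1:UTF-8") { |io| io.read }

  [raw, transcoded]
ensure
  file.unlink if file
end

# Items 1 and 2: the source encoding (controlled by a magic comment such as
# "# encoding: utf-8") and the default derived from the environment.
puts "source encoding:  #{__ENCODING__}"
puts "default_external: #{Encoding.default_external}"
puts "default_internal: #{Encoding.default_internal.inspect}"

raw, transcoded = demo_io_encodings
puts "#{raw.encoding}: #{raw.bytesize} bytes"
puts "#{transcoded.encoding}: #{transcoded.bytesize} bytes"
```

The same text ends up with a different byte length depending on how the stream was opened, which is exactly the kind of detail a developer currently has to track by hand.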

Right now all these decisions are left to the Ruby 1.9 developer, but in the general case a Rails developer would expect:

  • A default encoding that serves the largest set of people and data. UTF-8 fits this role: it is widely supported, used and accepted.
  • Automatic transcoding from other encodings to the default (UTF-8) where possible.
  • All of this working out of the box.
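To make the second expectation concrete, here is a minimal sketch of transcoding input to UTF-8 using only what Ruby 1.9 already provides (`String#force_encoding` and `String#encode`). The helper name `to_utf8` and the idea that the caller knows the declared encoding (for instance from a charset header) are assumptions for this example, not part of any existing Rails API.

```ruby
# Hypothetical helper: tag the incoming bytes with the encoding their source
# declared, then convert them to UTF-8.
def to_utf8(bytes, declared_encoding)
  tagged = bytes.dup.force_encoding(declared_encoding)  # relabel only, bytes unchanged
  tagged.encode(Encoding::UTF_8)                        # the actual conversion
end

latin1_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")  # "café" as sent by a Latin-1 client
text = to_utf8(latin1_bytes, "ISO-8859-1")

puts text.encoding        # UTF-8
puts text == "caf\u00E9"  # true
```

This covers only the case where the conversion succeeds. Input that cannot be converted raises an exception here, which is the situation the strictness levels in the goals are meant to handle.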

I would like to improve Rails by building on the functionality provided by Ruby 1.9, so that it has top-notch encoding support that just works.

Why is this important to the Rails community at large? Why is this important to you?

Even English speakers need good encoding support to handle sources with different encodings, such as email and user input.

This is important because, while a great part of the content is in English (ASCII, UTF-8), there is a huge pool of users and developers who are not native English speakers and who expect good support for the encodings they use.

I want to help reduce the pain of working with different encodings today. This will also help me develop applications that are compatible with the encoding of my native language.

List a clear set of goals/milestones you'll hit during the summer. Be specific.

List of features (goals):

  • Implementation of a standalone library codenamed ActiveEncoding.
    • The implementation will focus only on compatibility with Ruby 1.9.
    • Rails keeps working as usual with Ruby 1.8.
  • Application-wide configuration of encoding.
  • Following the Convention over Configuration paradigm, the defaults will be:
    • UTF-8 as the default encoding.
    • Transparent transcoding of any input string whose encoding is not UTF-8.
  • Levels of strictness in the transcoding of strings:
    • Strict: Any invalid transcode raises an exception.
    • Transcode: Attempts a full transcode of the string. If a valid transcode is not entirely possible, a partial transcode is done.
    • Ignore: Ignores the string and returns a canned string.
    • The level is selected through the application configuration:
        config.active_encoding.handle_incompatible_characters = :strict
        config.active_encoding.handle_incompatible_characters = :transcode
        config.active_encoding.handle_incompatible_characters = :ignore
  • Logging of the line, source, string, etc. involved in an invalid transcode, enabled by default for all levels.
  • This work will touch every component in the framework. With three months of development, I think it is necessary to focus on a subset of the framework:
    • Full integration with controllers and views (ActionPack).
    • If there is surplus time, begin incorporating the functionality into the other components of Rails.
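The three strictness levels and the default logging from the goals above could be sketched roughly as follows. This is only an illustration built on the `String#encode` options that Ruby 1.9 already offers. The module name `ActiveEncodingSketch`, its `transcode` method, the canned string and the log message format are all assumptions, not the final ActiveEncoding API.

```ruby
require "logger"

# Hypothetical sketch; none of these names are a real ActiveEncoding API.
module ActiveEncodingSketch
  TARGET = Encoding::UTF_8
  CANNED = "[unreadable text]"  # assumed placeholder for the :ignore level

  class << self
    attr_accessor :logger
  end
  self.logger = Logger.new($stderr)  # logging is on by default for all levels

  # Transcodes +string+ to UTF-8, applying +level+ when the conversion fails.
  def self.transcode(string, level = :transcode)
    string.encode(TARGET)
  rescue Encoding::InvalidByteSequenceError, Encoding::UndefinedConversionError => e
    logger.warn("invalid transcode from #{string.encoding} at #{caller.first}: " \
                "#{e.message} in #{string.inspect}")
    case level
    when :strict    then raise    # re-raise the original exception
    when :transcode then string.encode(TARGET, invalid: :replace, undef: :replace)
    when :ignore    then CANNED
    else raise ArgumentError, "unknown level: #{level.inspect}"
    end
  end
end

# A Latin-1 byte in a string mislabelled as US-ASCII cannot be converted.
bad = [0x63, 0x61, 0x66, 0xE9].pack("C*").force_encoding("US-ASCII")

puts ActiveEncodingSketch.transcode(bad, :transcode) == "caf\uFFFD"  # true
puts ActiveEncodingSketch.transcode(bad, :ignore)
begin
  ActiveEncodingSketch.transcode(bad, :strict)
rescue EncodingError => e
  puts "strict raised #{e.class}"
end
```

In this sketch the partial transcode of the :transcode level keeps every convertible character and substitutes U+FFFD for the rest, which is what `invalid: :replace` does for a Unicode target. Whether ActiveEncoding should use that replacement or something else is an open design decision.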

Give a rough timeline for hitting these milestones.

Community Bonding: April 20th - May 23rd

  • Work on encoding- and ActionPack-related bugs to get familiar with the Rails code.

Note: I have started working on bugs #2188 and #1988. I have attached a workaround patch while I work toward a proper solution for them.

Iteration 1: May 23 - May 29

  • Set up project layout, build tools, etc.

Iteration 2: May 30 - June 5

  • Implementation of transparent transcode feature (with the :transcode level in mind).

Iteration 3: June 6 - June 12

  • Finish the last bits of the transparent transcode feature.
  • Begin implementation of the levels of strictness.

Iteration 4: June 13 - June 19

  • Finish implementation of all the levels.
  • Implementation of the logging feature.

Iteration 5: June 20 - June 26

  • Begin implementation of the application-wide configuration feature.

Iteration 6: June 27 - July 3

  • Finish implementation of the application-wide configuration feature.

Iteration 7 (Mid Milestone): July 4 - July 10

  • First release of ActiveEncoding with the proposed features.

Iteration 8: July 11 - July 17

  • Begin integration of ActiveEncoding into ActionView.

Iteration 9: July 18 - July 24

Iteration 10: July 25 - July 31

Iteration 11 (Final Milestone): August 1 - August 10

  • Finish integration of ActiveEncoding into ActionView.

Iteration 12: August 11 - August 17

  • Final Touches - Documentation, Tests, etc.

How will you measure progress? How will you handle falling behind?

My plan is to work in one-week iterations, so that I get a good feel for the work I have finished and the work still to be done. This will also force me to release often and get feedback on the general direction of the implementation. If I begin to fall behind, I will re-estimate the remaining work and re-prioritize the features to be completed. In the worst case I will decide whether to drop a non-crucial feature so that I can keep the main objective moving forward.

What are the "unknowns" in this project for you? What kind of pitfalls could you run into?

While we finally have a stable release of the 1.9 series, there could still be changes to the API and implementation of Encoding and String. These would affect ActiveEncoding directly, but I expect only minor changes, if any.

I would like to thank Manfred Stienstra for all his help in preparing this proposal.

Proposal for the implementation of the ActiveEncoding library, which will make the manipulation of strings in different encodings (compatible and incompatible) transparent.
