LocalVariablesAndBlocks (Ruby)

Current situation

RubyTalk:64671 (Guy Decoux)

 What you call a "block parameter" don't exist, this is *just* an
 assigned variable : understand this and you'll understand the rules used
 by ruby actually.

Guy Decoux

See an expanded description of the /CurrentBehaviour

Some history

matz wrote

Originally, the "|...|" used to be a simple iteration variable
specifier.  But when Proc (and closure) was introduced to Ruby (back
in Ruby 0.60 in 1994), I needed block local variables, so I made the
current local variable scoping rule, which is "when a new local
variable appears first in a block, it is valid only in the block".

And *I was wrong*.  This rule is the single biggest design flaw in
Ruby.  You have to care about local variable name conflict, and you
will have totally different result (without error) if they conflict.

So, we are talking about a part of many-year-long effort of fixing
this flaw.

Constraints

RubyTalk:42266 (matz)

We have following constraints:

 * the fix should not cause fatal compatibility problem.  It may not
   cause any problem, or at least may not cause any problem which
   cannot be fixed by the filter program.

 * the fix need not to cover arbitrary combination of local/non-local
   variables in block parameter, so that you don't need the one like

     |a,{b},c|        # where a goes block-local

   as someone proposed before.  We have two major usage of block
   parameters, a) iterator variables b) closure arguments.  The
   current block parameters are suitable (and designed for) a.  I
   guess we need a new notation designed for b too.

 * preferably, the above *bad* scoping rule should be removed from the
   language in the future, so that I guess we need a new notation to
   *declare* in-block variables.

 * Ruby does not like explicit variable declaration (except method
   arguments, of course), so that a new in-block variable declaration
   is better not be explicit, if possible.

Proposed modifications

matz's original idea

RubyTalk:52440 (matz)

 The "solution" changes time to time in my mind.  Currently I'm
thinking of the following:

  * variables assigned by ":=" should be local to the block (or
    possibly local to the nearest compound statements).

  * if ":=" assignee is already defined in outer scope, it should be
    warned (no -w needed, probably), and outer variable is shadowed
    hereafter.

  * all local variables in block parameters (e.g. var in |var|) should
    be treated as if they are assigned by ":=".  other types of
    variables in block parameter should be warned.

  * scope of local variables assigned by "=" will be a nearest "body"
    (method body, class body etc.) consistently.

This does not change the appearance of Ruby code much, unlike <var>
solution.  It is incompatible to the current syntax, for example,

   a = 5
   [1,2,3].each{|a| break if a % 2 == 0}
   p a

prints "2" now.  It will print "5" (with shadowing warning) if we
adopt the changes above.  Since it is not compatible, it will not be
available in the near future.  Perhaps you have to wait until Rite.

							matz.
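
For comparison, here is a minimal sketch of the two outcomes described above. It assumes a Ruby 1.9 or later interpreter, where block parameters are local to the block (the behaviour proposed here); on the older interpreters discussed on this page the first method would return 2 as well. The method names `shadowed` and `rebound` are hypothetical and used only for illustration.

```ruby
# The block parameter `a` shadows the outer `a`, so the outer value survives.
# (Assumption: Ruby 1.9+; some versions warn about the shadowing under -w.)
def shadowed
  a = 5
  [1, 2, 3].each { |a| break if a % 2 == 0 }
  a
end

# An explicit assignment inside the block writes to the outer `a`,
# which reproduces the result the old rule gave implicitly.
def rebound
  a = 5
  [1, 2, 3].each { |x| a = x; break if a % 2 == 0 }
  a
end

p shadowed  # 5
p rebound   # 2
```

Writing the assignment out, as in `rebound`, makes the intent to modify the outer variable visible in the code rather than implied by the parameter name.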

Latest proposal

RubyTalk:63100 (matz)

In message "Re: Local variables & blocks"
    on 03/01/29, Mauricio Fernández <batsman.geo@yahoo.com> writes:

|To summarize:
| '=' refers to variables in the current method / class scope.
| ':=' refers to local vars
| block arguments are assigned with ':='
| in case of shadowing a warning is issued.

I may drop ':=' part, i.e.

  * block parameters are local to the block
  * shadowing cause warnings
  * no other way to make block local variables

                                                        matz.

See a summary of the /ProposedBehaviour

What's wrong with the /CurrentBehaviour?

RubyTalk:63509 (matz)

|I just re-read the Pickaxe section on threads, and yes I agree - and I also
|think it's a mess. When you assign to 'x' within a block, it has two
|entirely different behaviours, depending on whether 'x' was in existence
|previously or not.

I agree with you.  I was insane when I designed this rule.  That's why
I propose the change.  But I don't think we can just drop block local
variables like yours, since we have Proc (some kind of anonymous
function), we need some kind of local variables of their own.
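
The two behaviours of assigning to 'x' inside a block can be seen side by side in the sketch below. The method names are hypothetical; the only difference between the two methods is whether 'x' is assigned before the block.

```ruby
# No assignment to x before the block: x is created inside the block
# and is gone once the block ends.
def assign_fresh
  [1, 2, 3].each { |i| x = i }
  defined?(x)          # nil: no local variable x exists at this point
end

# x is assigned before the block: the block writes to that same x.
def assign_existing
  x = 0
  [1, 2, 3].each { |i| x = i }
  x                    # 3: the block changed the method's x
end

p assign_fresh     # nil
p assign_existing  # 3
```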

How can it be improved?

RubyTalk:64670 (Brian Candler)

I think it's a barrier to newcomers that "a=0" has various different
behaviours depending on where you put it and whether or not you had a
previous assignment to "a" elsewhere in your method. I understand it now,
but it took a while.

Furthermore, your code may sometimes *rely* on variables being block-local,
but you have no way to signal your intention other than the *absence* of an
assignment outside of the block... documentation by omission :-) I end up
using bizarre names like "thread_i" to indicate this.

Essentially, we have four different behaviours now, all of which are useful,
but all of which are implied:

(1) 'a' (assigned variable) is bound to the method's local variables

  def method
    a = 0
    [1,2,3].each {|p|
      a = p             # <<<
    }
    puts a
  end

(2) 'a' is local to the block

  def method
    [1,2,3].each {|p|
      a = p             # <<<
    }
  end

(3) 'p' (block parameter) is bound to the method's local variables

  def method
    p = nil
    [1,2,3].each {|p|   # <<<
      a = p
    }
    puts p
  end

(4) 'p' is local to the block

   def method
     [1,2,3].each {|p|  # <<<
       a = p
     }
   end

And actually cases (1) and (3) have variants where they bind to the
variables of an enclosing block, as opposed to the method itself:

(1a) 'a' is bound to an enclosing block

  def method
    [1,2,3].each {|p|
      a = nil           # <<< a is local to outer block
      [4,5,6].each {|q|
        a = q           # <<< this is the same 'a' in outer block
      }
      puts a
    }
  end

(3a) 'q' is bound to an enclosing block

  def method
    [1,2,3].each {|p|
      q = nil
      [4,5,6].each {|q| # <<< bound to 'q' in outer block
        a = q
      }
      puts q
    }
  end

These last two really do seem to be different, because a method is not the
same as a Proc object, and so a method local variable is not the same as a
block local variable.
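
Case (1a) can be checked with a small sketch like the one below. The method name `inner_results` and the `seen` array are hypothetical additions that make the effect observable: 'a' is shared between the outer and inner blocks, yet it does not exist at method level.

```ruby
def inner_results
  seen = []
  [1, 2, 3].each do |p|
    a = nil                            # local to the outer block
    [4, 5, 6].each { |q| a = q + p }   # the same 'a' as in the outer block
    seen << a                          # the value left by the inner block
  end
  [seen, defined?(a)]                  # 'a' is not a method-local variable
end

p inner_results  # [[7, 8, 9], nil]
```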

I think all the above cases will appear in real programs, except IMO cases
(3) and (3a) are ugly hacks which nobody should be allowed to use :-) That
is, parameters to a block should be like the formal parameters to a method,
which are always local. That is acknowledged by Matz in the proposed New
Rules.

But apart from that, we still have to signal to the interpreter whether we
want 'a' or 'p' to be local or bound to enclosing scope, and whatever you do
that's going to involve syntax rules. Let me try to summarise:

Current rules
=============

'a' bound       previous assignment to 'a'
'a' local       no previous assignment to 'a' before block
'|p|' bound     previous assignment to 'p'
'|p|' local     no previous assignment to 'p' before block

Proposed 'New Rules'
====================

'a' bound       the default*
'a' local       not available, use |a| in a fake block, e.g. local {|a| ...}
'|p|' bound     not available, use '|q| p=q'
'|p|' local     the default

*if block is nested within another block, not clear whether 'a' binds to the
entire method or to the enclosing block; this may still depend on where 'a'
was previously assigned to.
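
The two workarounds in the 'New Rules' table might look like the sketch below. It assumes an interpreter that already implements the proposal (block parameters local to the block, as in Ruby 1.9 and later). The `local` helper is the hypothetical "fake block" from the table, not an existing Ruby method, and `demo` is an illustrative name.

```ruby
# Hypothetical helper: a "fake block" whose only job is to open a new scope.
def local
  yield
end

def demo
  a = :outer
  p = nil
  # 'a' local: declare it as a parameter of the fake block.
  local do |a|
    a = :inner         # changes only the block's own 'a'
  end
  # '|p|' bound: take a fresh parameter and assign it explicitly.
  [1, 2, 3].each { |q| p = q }
  [a, p]
end

result = demo
puts result.inspect  # [:outer, 3]
```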

Any other set of rules is going to have to have syntax for each of these
cases, and you just choose your sugar to taste. Many have been suggested.
e.g.

Everything defaults to bound to method
======================================

'a' bound       a=0
'a' local       my a=0          or      %a=0
'|p|' bound     |p|
'|p|' local     |my p|          or      |%p|

Everything defaults to block-local
==================================

'a' bound       our a=0         or      { |p| <a> ... }
'a' local       a=0
'|p|' bound     |our p|
'|p|' local     |p|

Matz doesn't want any explicit declarations, so basically the new rules
forbid two behaviours completely ('a' local and '|p|' bound), with
workarounds if you need those behaviours.

'|p|' bound is not very useful anyway, so your fundamental problem boils
down to how to choose between the two behaviours for local variables:

   max = 0
   obj.each {|i|
     max = i if i > max   # max must be BOUND or this doesn't work
   }
   puts max

   Thread.new {
     tmp = Obj.new        # tmp must be LOCAL or this doesn't work
     tmp.doit
   }
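
A self-contained variant of these two snippets is sketched below. The method names and the squaring done in place of `Obj.new` and `doit` are hypothetical stand-ins, chosen so that the result can be checked.

```ruby
# max is assigned before the block, so the block updates the method's max.
def largest(values)
  max = 0
  values.each { |i| max = i if i > max }
  max
end

# tmp is first assigned inside the block, so every thread has its own tmp.
def squares_in_threads(values)
  threads = values.map do |v|
    Thread.new do
      tmp = v * v      # LOCAL: no assignment to tmp precedes the block
      tmp
    end
  end
  threads.map(&:value) # wait for each thread and collect its result
end

p largest([3, 9, 4])             # 9
p squares_in_threads([1, 2, 3])  # [1, 4, 9]
```

If a `tmp = nil` were added before the `values.map` call, all threads would share one variable and could overwrite each other's value, which is the hazard the LOCAL case is about.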

Other proposals

These are ideas people have proposed in the past. Before reinventing one, please browse the mailing list archives to check whether your proposal adds something new.

/OtherProposals


Rev 11, Last edited at July 25, 2004 23:24 pm by rgDazButcher / 213.249.213.194 (diff)