set_trace_func 和骗局

本文是对 http://t-a-w.blogspot.jp/2007/04/settracefunc-smoke-and-mirrors.html 的粗略翻译，如有错误请务必指正。

Ruby 解释器提供了可用于追踪 Ruby 执行的事件钩子。设置钩子用 set_trace_func(some_proc)；卸载钩子则用 set_trace_func(nil)。

追踪器恐怕是发明以来最有用的调试工具，远比基于单步和断点的调试器有用。

Ruby 追踪 8 种事件：

line – 文件或行改变
class – 模块或类定义开始
end – 模块或类定义结束
call – Ruby 方法调用
return – Ruby 方法返回
c-call – C 方法调用
c-return – C 方法返回
raise – 异常抛出

有些事件并不会被追踪，例如设置和读取变量（局部、全局、实例或者类变量）。

set_trace_func proc { |event, file, line, id, binding, classname|
  STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

@welcome  = "Hello"
separator = ", "
$object   = "world"
@@excl    = "!"
puts(@welcome + separator + $object + @@excl)

set_trace_func proc { |event, file, line, id, binding, classname|

STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname

}

@welcome = "Hello"

separator = ", "

$object = "world"

@@excl = "!"

puts(@welcome + separator + $object + @@excl)

$ ./example-1.rb
    line ./example-1.rb:7                false
    line ./example-1.rb:8                false
    line ./example-1.rb:9                false
    line ./example-1.rb:10               false
    line ./example-1.rb:11               false
  c-call ./example-1.rb:11          +   String
c-return ./example-1.rb:11          +   String
  c-call ./example-1.rb:11          +   String
c-return ./example-1.rb:11          +   String
  c-call ./example-1.rb:11          +   String
c-return ./example-1.rb:11          +   String
  c-call ./example-1.rb:11       puts   Kernel
  c-call ./example-1.rb:11      write       IO
c-return ./example-1.rb:11      write       IO
  c-call ./example-1.rb:11      write       IO
c-return ./example-1.rb:11      write       IO
c-return ./example-1.rb:11       puts   Kernel

$ ./example-1.rb

line ./example-1.rb:7 false

line ./example-1.rb:8 false

line ./example-1.rb:9 false

line ./example-1.rb:10 false

line ./example-1.rb:11 false

c-call ./example-1.rb:11 + String

c-return ./example-1.rb:11 + String

c-call ./example-1.rb:11 + String

c-return ./example-1.rb:11 + String

c-call ./example-1.rb:11 + String

c-return ./example-1.rb:11 + String

c-call ./example-1.rb:11 puts Kernel

c-call ./example-1.rb:11 write IO

c-return ./example-1.rb:11 write IO

c-call ./example-1.rb:11 write IO

c-return ./example-1.rb:11 write IO

c-return ./example-1.rb:11 puts Kernel

例子中，String#+ 运行了三次，IO#write 两次——第二次大概是输出换行符的。

Ruby 中很多东西很神奇并没有被转换为方法调用。这种东西的一种是字符串插值（... #{ code } ...），并没有被转换为 String#+ 或是 Array#join，而是直接由解释器特别的处理。

set_trace_func proc { |event, file, line, id, binding, classname|
  STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}
@welcome  = "Hello"
$object   = "world"
puts("#{@welcome}, #{$object}\n")

set_trace_func proc { |event, file, line, id, binding, classname|

STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname

}

@welcome = "Hello"

$object = "world"

puts("#{@welcome}, #{$object}\n")

$ ./example-2.rb
    line ./example-2.rb:7                false
    line ./example-2.rb:8                false
    line ./example-2.rb:9                false
  c-call ./example-2.rb:9        puts   Kernel
  c-call ./example-2.rb:9       write       IO
c-return ./example-2.rb:9       write       IO
c-return ./example-2.rb:9        puts   Kernel

$ ./example-2.rb

line ./example-2.rb:7 false

line ./example-2.rb:8 false

line ./example-2.rb:9 false

c-call ./example-2.rb:9 puts Kernel

c-call ./example-2.rb:9 write IO

c-return ./example-2.rb:9 write IO

c-return ./example-2.rb:9 puts Kernel

同样，创建类并没有被转化为 Class#new，添加方法并没有被转换为 Module#define_method。幸运的是 Ruby 提供了特别的钩子让我们需要捕捉这些事件——当类创建时，其父类的 inherited 方法会被调用；当一个方法被添加时，Ruby 调用其类的 method_added。

set_trace_func proc { |event, file, line, id, binding, classname|
  STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
} 

class Geolocation
  def initialize(lat, long)
    @lat  = lat
    @long = long
  end
  def to_s
    "<#{@lat}, #{@long}>"
  end
end

set_trace_func proc { |event, file, line, id, binding, classname|

STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname

}

class Geolocation

def initialize(lat, long)

@lat = lat

@long = long

end

def to_s

"<#{@lat}, #{@long}>"

end

$ ./example-3.rb
    line ./example-3.rb:7                false
  c-call ./example-3.rb:7   inherited    Class
c-return ./example-3.rb:7   inherited    Class
   class ./example-3.rb:7                false
    line ./example-3.rb:8                false
  c-call ./example-3.rb:8  method_added   Module
c-return ./example-3.rb:8  method_added   Module
    line ./example-3.rb:12               false
  c-call ./example-3.rb:12 method_added   Module
c-return ./example-3.rb:12 method_added   Module
     end ./example-3.rb:7                false

$ ./example-3.rb

line ./example-3.rb:7 false

c-call ./example-3.rb:7 inherited Class

c-return ./example-3.rb:7 inherited Class

class ./example-3.rb:7 false

line ./example-3.rb:8 false

c-call ./example-3.rb:8 method_added Module

c-return ./example-3.rb:8 method_added Module

line ./example-3.rb:12 false

c-call ./example-3.rb:12 method_added Module

c-return ./example-3.rb:12 method_added Module

end ./example-3.rb:7 false

如果我们想跟踪些有用的东西，是时候改改格式了。如果我们对跟踪执行过程感兴趣，通常来说文件及行号都没什么用，return/c-return/end 则应该用缩进代替。同时，相比类，我们更注重的是接收者（`self）。

$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
  if event == "line"
    # 忽略
  elsif %w[return c-return end].include?(event)
    $indent -= 2
  else
    if event == "class"
      STDERR.printf "%*s%s %s\n", $indent, "", event, eval("self", binding)
    else
      STDERR.printf "%*s%s %s.%s\n", $indent, "", event, eval("self", binding), id
    end
    $indent += 2 if %w[call c-call class].include?(event)
  end
}

class Geolocation
  def initialize(lat, long)
    @lat  = lat
    @long = long
  end
  def to_s
    "<#{@lat}, #{@long}>"
  end
end

a = Geolocation.new(51.12, 17.03)
puts a
p a

$indent = 0

set_trace_func proc { |event, file, line, id, binding, classname|

if event == "line"

# 忽略

elsif %w[return c-return end].include?(event)

$indent -= 2

else

if event == "class"

STDERR.printf "%*s%s %s\n", $indent, "", event, eval("self", binding)

else

STDERR.printf "%*s%s %s.%s\n", $indent, "", event, eval("self", binding), id

end

$indent += 2 if %w[call c-call class].include?(event)

end

}

class Geolocation

def initialize(lat, long)

@lat = lat

@long = long

end

def to_s

"<#{@lat}, #{@long}>"

end

a = Geolocation.new(51.12, 17.03)

puts a

p a

$ ./example-4.rb  >/dev/null
c-call Object.inherited
class Geolocation
  c-call Geolocation.method_added
  c-call Geolocation.method_added
c-call Geolocation.new
  call <, >.initialize
c-call main.puts
  call <51.12, 17.03>.to_s
    c-call 51.12.to_s
    c-call 17.03.to_s
  c-call #<IO:0xb7c80fc0>.write
  c-call #<IO:0xb7c80fc0>.write
c-call main.p
  c-call <51.12, 17.03>.inspect
    c-call 17.03.inspect
      c-call 17.03.to_s
    c-call 51.12.inspect
      c-call 51.12.to_s
  c-call #<IO:0xb7c80fc0>.write
  c-call #<IO:0xb7c80fc0>.write

$ ./example-4.rb >/dev/null

c-call Object.inherited

class Geolocation

c-call Geolocation.method_added

c-call Geolocation.new

call <, >.initialize

c-call main.puts

call <51.12, 17.03>.to_s

c-call 51.12.to_s

c-call 17.03.to_s

c-call #<IO:0xb7c80fc0>.write

c-call main.p

c-call <51.12, 17.03>.inspect

c-call 17.03.inspect

c-call 17.03.to_s

c-call 51.12.inspect

c-call 51.12.to_s

c-call #<IO:0xb7c80fc0>.write

这里的输出又有点小怪癖。在 Geolocation#initialize 开始时， Geolocation#to_s 就被追踪器调用，而不是在其有意义之后。这种情况下，我们简单的得到了垃圾 <, >。但是有些对象，在初始化之前被打印是会产生异常的。

require "complex"

$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
  if event == "line"
    # 忽略
  elsif %w[return c-return end].include?(event)
    $indent -= 2
  else
    obj = eval("self", binding)
    if event == "class"
      STDERR.printf "%*s%s %s\n", $indent, "", event, obj
    else
      obj = "<#{obj.class}##{obj.object_id}>" if id == :initialize
      STDERR.printf "%*s%s %s.%s\n", $indent, "", event, obj, id
    end
    $indent += 2 if %w[call c-call class].include?(event)
  end
}

a = Complex.new(11.0, -5.0)
b = Complex.new(2.0, 13.5)
c = a * b

require "complex"

$indent = 0

set_trace_func proc { |event, file, line, id, binding, classname|

if event == "line"

# 忽略

elsif %w[return c-return end].include?(event)

$indent -= 2

else

obj = eval("self", binding)

if event == "class"

STDERR.printf "%*s%s %s\n", $indent, "", event, obj

else

obj = "<#{obj.class}##{obj.object_id}>" if id == :initialize

STDERR.printf "%*s%s %s.%s\n", $indent, "", event, obj, id

end

$indent += 2 if %w[call c-call class].include?(event)

end

}

a = Complex.new(11.0, -5.0)

b = Complex.new(2.0, 13.5)

c = a * b

$ ./example-5.rb
c-call Complex.new
  call <Complex#-605829048>.initialize
    c-call 11.0.kind_of?
    c-call 11.0.kind_of?
    c-call -5.0.kind_of?
    c-call -5.0.kind_of?
c-call Complex.new
  call <Complex#-605832038>.initialize
    c-call 2.0.kind_of?
    c-call 2.0.kind_of?
    c-call 13.5.kind_of?
    c-call 13.5.kind_of?
call 11.0-5.0i.*
  c-call 2.0+13.5i.kind_of?
  c-call 11.0.*
  c-call -5.0.*
  c-call 22.0.-
  c-call 11.0.*
  c-call -5.0.*
  c-call 148.5.+
  call 11.0-5.0i.Complex
    c-call 138.5.==
    call 89.5.real
    call 138.5.imag
    c-call 89.5.-
    call 89.5.imag
    call 138.5.real
    c-call 0.+
    c-call Complex.new
      call <Complex#-605842188>.initialize
        c-call 89.5.kind_of?
        c-call 89.5.kind_of?
        c-call 138.5.kind_of?
        c-call 138.5.kind_of?

$ ./example-5.rb

c-call Complex.new

call <Complex#-605829048>.initialize

c-call 11.0.kind_of?

c-call -5.0.kind_of?

c-call Complex.new

call <Complex#-605832038>.initialize

c-call 2.0.kind_of?

c-call 13.5.kind_of?

call 11.0-5.0i.*

c-call 2.0+13.5i.kind_of?

c-call 11.0.*

c-call -5.0.*

c-call 22.0.-

c-call 11.0.*

c-call -5.0.*

c-call 148.5.+

call 11.0-5.0i.Complex

c-call 138.5.==

call 89.5.real

call 138.5.imag

c-call 89.5.-

call 89.5.imag

call 138.5.real

c-call 0.+

c-call Complex.new

call <Complex#-605842188>.initialize

c-call 89.5.kind_of?

c-call 138.5.kind_of?

许多难以找到错误的，最终变成了琐碎的自定义追踪器、grep 或是一些一行 Ruby 脚本来美化事实。

骗局

set_trace_func 不仅仅对调试有用，也适合做各种有趣的事情。Binding.of_caller 是用 set_trace_func 实现的，虽然已经不能用了。magic/help 用了 set_trace_func 在 irb 中提供方便的帮助：

irb> help { STDERR.write }
--------------------------------------------------------------- IO#write
     ios.write(string)    => integer
------------------------------------------------------------------------
     Writes the given string to _ios_. The stream must be opened for
     writing. If the argument is not a string, it will be converted to a
     string using +to_s+. Returns the number of bytes written.

        count = $stdout.write( "This is a test\n" )
        puts "That was #{count} bytes of data"

     _produces:_

        This is a test
        That was 15 bytes of data

irb> help { STDERR.write }

--------------------------------------------------------------- IO#write

ios.write(string) => integer

------------------------------------------------------------------------

Writes the given string to _ios_. The stream must be opened for

writing. If the argument is not a string, it will be converted to a

string using +to_s+. Returns the number of bytes written.

count = $stdout.write( "This is a test\n" )

puts "That was #{count} bytes of data"

_produces:_

This is a test

That was 15 bytes of data

magic/help 通过设置一个 set_trace_func 钩子工作，执行传送过来的块，然后一遇到感兴趣的事件就停止执行。这意味着 help { rm_rf "/" } 完全是安全的。

我开发 magic/help 时遇到的一个难懂的问题是，方法没有按正确的参数个数调用时，追踪结果不一致。

对于 C 写的方法，就算显式声明了参数个数，Ruby 依然在 ArgumentError.new 前产生 c-call。另一方面，对于 Ruby 写的方法，ArgumentError.new 被调用，然后从函数 return 返回，而不产生 call。

要描述这个问题，我们来跟踪一下 C 方法 STDERR.write：

c-call #<IO:0xb7c98fa8>.write
  c-call ArgumentError.new
    c-call #<ArgumentError: ArgumentError>.initialize
    c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
  c-return ArgumentError.new
  c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
  c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
  c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
  c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
  raise #<IO:0xb7c98fa8>.write
c-return #<IO:0xb7c98fa8>.write

c-call #<IO:0xb7c98fa8>.write

c-call ArgumentError.new

c-call #<ArgumentError: ArgumentError>.initialize

c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize

c-return ArgumentError.new

c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace

c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace

c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace

c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace

raise #<IO:0xb7c98fa8>.write

c-return #<IO:0xb7c98fa8>.write

与此同时我们跟踪一下 Ruby 方法 FileUtils.remove_entry:

# "call FileUtils.remove_entry" 永远不会产生！
  c-call ArgumentError.new
    c-call #<ArgumentError: ArgumentError>.initialize
    c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
  c-return ArgumentError.new
  c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
  c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
  c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
  c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
  raise FileUtils.remove_entry
# "return FileUtils.remove_entry" 确实产生了。
return FileUtils.remove_entry

# "call FileUtils.remove_entry" 永远不会产生！

c-call ArgumentError.new

c-call #<ArgumentError: ArgumentError>.initialize

c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize

c-return ArgumentError.new

c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace

c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace

c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace

c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace

raise FileUtils.remove_entry

# "return FileUtils.remove_entry" 确实产生了。

return FileUtils.remove_entry

两种情况下，参数都是在调用方法之前被检查的。IO#write 是定义为一个参数的方法：

rb_define_method(rb_cIO, "write", io_write, 1);

1	rb_define_method(rb_cIO, "write", io_write, 1);

新版的 magic/help 也能处理这个情况了——如果首事件是 c-call 到 ArgumentError.new，magic/help 就等到首个 return 事件（如果有）而不是立刻放弃执行传来的块。快去下载新版，享受神奇的帮助吧。

orzFly

His Memories from Here to There, with Fuzzy Ginkgo

set_trace_func 和骗局

骗局

One thought on “set_trace_func 和骗局”

骗局

Related posts:

One thought on “set_trace_func 和骗局”