
Don't @ Me! Faster Instance Variables with Object Shapes

This presentation is about the Object Shapes implementation in Ruby 3.2 and the impact that it has on the JIT compiler

Aaron Patterson

November 30, 2022

Transcript

  1. Don't @ me!
    Faster Instance Variables with Object Shapes

    View Slide

  2. View Slide

  3. Aaron Patterson

    View Slide

  4. 15 min standup

    View Slide

  5. Ruby Core Team

    View Slide

  6. Rails Core Team

    View Slide

  7. @tenderlove
    mastodon.social/@tenderlove


    GitHub


    Cohost


    Instagram

    View Slide

  8. LinkedIn: tenderlove

    View Slide

  9. View Slide

  10. Ruby Infrastructure

    View Slide

  11. Instance Variables

    View Slide

  12. Instance Variables: TMI

    View Slide

  13. Object Shapes

    View Slide

  14. YJIT

    View Slide

  15. All in 30 min! LOL

    View Slide

  16. Thanks!!
    🥰🥰🥰🥰🥰🥰🥰🥰🥰
    Ruby Infrastructure

    YJIT team

    Jemma Issroff

    Maxime Chevalier-Boisvert

    John Hawthorn @ GitHub

    View Slide

  17. How do IVARS work?
    Don't @ me!

    View Slide

  18. Instance Variables = IVARs = IVs

    View Slide

  19. Implementing


    Instance Variables

    View Slide

  20. Instance Variables
    How to store them?
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    Hello.new
    Instance of
    Hello
    IV Hash Table
    IV Name IV Value
    :@foo 1
    :@bar 2

    View Slide

  21. Hash based implementation
    If it were written in Ruby
    class Object
      def initialize
        @ivs = {} # Magic instance variable hash
      end

      def instance_variable_set name, value
        @ivs[name] = value
      end

      def instance_variable_get name
        @ivs[name]
      end

      def instance_variable_defined? name
        # ooohhh, why did he put this method in the example?
        # I bet it's foreshadowing!
        @ivs.key? name
      end
    end

    View Slide

  22. Ruby <= 1.8.X

    View Slide

  23. Tree Walking Interpreter
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end
    +
    @foo @bar
    1 2
    3
    Hash Lookup! Hash Lookup!

    View Slide

  24. Ruby 1.9: YARV

    View Slide

  25. YARV Execution
    Code is compiled to instructions before execution
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    hi = Hello.new


    hi.foo
    Source Code Byte Code for "foo"
    [:getivar, :@foo]
    [:getivar, :@bar]
    [:plus]
    VM Stack
    1
    2
    3

    View Slide
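A small sketch for inspecting the instructions YARV compiles for `foo`. It assumes CRuby, because `RubyVM::InstructionSequence` is implementation-specific. The slides abbreviate the instruction as `getivar`; in the real disassembly it shows up as `getinstancevariable`, and the exact listing varies by Ruby version.

```ruby
class Hello
  def initialize
    @foo = 1
    @bar = 2
  end

  def foo
    @foo + @bar
  end
end

# CRuby only: fetch the compiled instruction sequence behind Hello#foo
# and print its human-readable disassembly.
iseq = RubyVM::InstructionSequence.of(Hello.new.method(:foo))
puts iseq.disasm
```

The listing should contain two instance variable reads followed by an addition, which is the shape of the byte code shown on the slide.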

  26. Instruction Implementation
    Example implementation written in Ruby
    def getivar name
      get_self.instance_variables[name]
    end
    getivar Implementation
    [:getivar, :@foo]
    [:getivar, :@bar]
    [:plus]
    Get self from current stack frame
    Get Hash of IVs

    View Slide

  27. Hashes are slow
    Compared to Arrays

    View Slide

  28. Hashes Use Memory
    Compared to Arrays

    View Slide

  29. Let's Use an Array!
    Instead of a Hash!

    View Slide

  30. Instance Variables
    How to store them?
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    Hello.new
    Instance of
    Hello
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    IV Array
    0 1
    1 2

    View Slide

  31. Instance Variables (second instance)
    How to store them?
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    Hello.new
    Instance of
    Hello
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    IV Array
    0 1
    1 2

    View Slide

  32. Instance Variables (many instances)
    Hash table size is amortized
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    Hello.new
    Instance of
    Hello
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    IV Array
    0 1
    1 2
    Instance of
    Hello
    0 1
    1 2

    View Slide
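The layout on the last three slides can be sketched in plain Ruby. This is only a toy model: `ToyClass`, `ToyObject`, `index_for`, `ivar_set` and `ivar_get` are hypothetical names, not anything from the Ruby VM. The class owns the single name-to-index table, and each instance carries only a flat array of values.

```ruby
# Toy model: the class owns the name -> index table,
# each instance owns only a flat array of values.
class ToyClass
  attr_reader :ivar_index

  def initialize
    @ivar_index = {}
  end

  # Return the slot for a name, assigning the next free slot on first use.
  def index_for(name)
    @ivar_index[name] ||= @ivar_index.size
  end
end

class ToyObject
  attr_reader :klass, :ivars

  def initialize(klass)
    @klass = klass
    @ivars = []
  end

  def ivar_set(name, value)
    @ivars[@klass.index_for(name)] = value
  end

  def ivar_get(name)
    index = @klass.ivar_index[name]
    index && @ivars[index]
  end
end

hello_class = ToyClass.new

a = ToyObject.new(hello_class)
a.ivar_set(:@foo, 1)
a.ivar_set(:@bar, 2)

b = ToyObject.new(hello_class)
b.ivar_set(:@foo, 10)

p hello_class.ivar_index # one table, shared by every instance
p a.ivars
p b.ivars
```

Adding more instances grows only the per-instance arrays, which is why the cost of the hash table is amortized across all instances of the class.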

  33. Instance Variables Storage Location
    References are stored inside the object (it's in the computer)
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    Hello.new
    Instance of
    Hello
    In-Memory Layout
    Byte Index  Value
    0           Flags (a 64 bit bitmap)
    8           Pointer to Class
    16          First IV: 1
    24          Second IV: 2
    32          Third IV: Qundef

    View Slide

  34. Storage Location Depends on Type
    Objects store instance variables "in line", others in an external table
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end
    class PleaseDoNotDoThis < Array


    def initialize


    @foo = 1


    @bar = 2


    super


    end


    def foo


    @foo + @bar


    end


    end

    View Slide

  35. Instruction Implementation
    "foo" method instructions
    def getivar name
      # get the class
      klass = get_self.class

      # get the index of the ivar
      index = klass.ivar_index[name]

      if get_self.is_a?(Object)
        # get the ivar value
        get_self.instance_variables[index]
      else
        # do something different
      end
    end
    getivar Implementation
    [:getivar, :@foo]
    [:getivar, :@bar]
    [:plus]
    Still doing a hash
    lookup
    😆

    View Slide

  36. Inline Caches

    View Slide

  37. Instruction Implementation
    "foo" method instructions, with inline caches
    def getivar name, cache
      # If there is no cached index
      unless cache.index
        # get the class
        klass = get_self.class

        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        get_self.instance_variables[index]
      else
        # do something different
      end
    end
    getivar Implementation
    [:getivar, :@foo, cache]
    [:getivar, :@bar, cache]
    [:plus]
    Find and cache
    the index
    Use the cached
    index

    View Slide

  38. Usually No Hash Lookups!

    View Slide
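To make "usually no hash lookups" concrete, here is a runnable toy version of the inline cache idea. `InlineCache` and `ToyVM` are hypothetical names invented for this sketch, and the model assumes every name read is already present in the index table. Each instruction site gets its own cache object, and the VM counts how often it falls back to the hash.

```ruby
# Hypothetical per-instruction cache: remembers the index found on the first run.
InlineCache = Struct.new(:index)

class ToyVM
  attr_reader :hash_lookups

  def initialize(ivar_index)
    @ivar_index = ivar_index # name -> index table (per class in the talk)
    @hash_lookups = 0
  end

  # ivars is the object's flat value array.
  def getivar(ivars, name, cache)
    unless cache.index
      @hash_lookups += 1 # slow path: one hash lookup
      cache.index = @ivar_index[name]
    end
    ivars[cache.index] # fast path: plain array read
  end
end

vm = ToyVM.new({ :@foo => 0, :@bar => 1 })
foo_cache = InlineCache.new
bar_cache = InlineCache.new
ivars = [1, 2]

1000.times do
  vm.getivar(ivars, :@foo, foo_cache) + vm.getivar(ivars, :@bar, bar_cache)
end

puts vm.hash_lookups # prints 2
```

Two thousand reads cost two hash lookups, one per instruction site; every later read is an array access at the cached index.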

  39. Cache Lookup Problem
    Name to Index mapping is per class
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    class World < Hello


    def initialize


    @oops = "yikes!!"


    super


    end


    end


    Hello.new.foo


    World.new.foo
    Instance of
    Hello
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    IV Array
    0 1
    1 2
    Cache Index 0 and 1

    View Slide

  40. Cache Lookup Problem
    Name to Index mapping is per class
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    class World < Hello


    def initialize


    @oops = "yikes!!"


    super


    end


    end


    Hello.new.foo


    World.new.foo
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    Cache Index 0 and 1
    Name Index
    :@oops 0
    :@foo 1
    :@bar 2
    World Class
    Oops was set first!
    Oops was set first!

    View Slide

  41. Use Class as a Cache Key

    View Slide

  42. Compare Class in Cache
    Cache miss if no index or the class doesn't match
    def getivar name, cache
      # get the class
      klass = get_self.class

      # If there is no cached index or the class doesn't match
      if !(cache.index && cache.klass == klass)
        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index

        # store the class
        cache.klass = klass
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        get_self.instance_variables[index]
      else
        # do something different
      end
    end
    Class must match
    and IV index set
    Return value at
    index inside list

    View Slide

  43. Subclasses Cause Cache Misses
    Since the class is a cache key, subclasses can't share cache with superclass
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    class World < Hello


    end


    hello = Hello.new


    world = World.new


    loop do


    hello.foo


    world.foo


    end
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    Name Index
    :@foo 0
    :@bar 1
    World Class

    View Slide
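The cost of alternating `hello.foo` and `world.foo` can be counted with a small simulation. Everything here is an illustrative assumption: `ClassCache`, `INDEX_TABLES` and `misses_for` are made-up names, and classes are represented by plain symbols. It models a single `getivar :@foo` instruction site whose cache is keyed on the class.

```ruby
ClassCache = Struct.new(:klass, :index)

# Hypothetical index tables: World has the same layout as Hello,
# but it is a separate table belonging to a separate class.
INDEX_TABLES = {
  Hello: { :@foo => 0, :@bar => 1 },
  World: { :@foo => 0, :@bar => 1 },
}.freeze

# Simulate one `getivar :@foo` call site visited by objects of the given
# classes in turn, and return how many times the cache missed.
def misses_for(classes, iterations)
  cache = ClassCache.new
  misses = 0
  iterations.times do
    classes.each do |klass|
      unless cache.index && cache.klass == klass
        misses += 1
        cache.index = INDEX_TABLES.fetch(klass).fetch(:@foo)
        cache.klass = klass
      end
    end
  end
  misses
end

puts misses_for([:Hello], 100)         # prints 1
puts misses_for([:Hello, :World], 100) # prints 200
```

With one class the site misses once and then always hits. With two classes taking turns, every visit evicts the other class's entry, even though both tables map `:@foo` to the same index.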

  44. 🐵🔧

    View Slide

  45. class Hello


    def initialize(set_bar)


    @foo = 1


    @bar = 2 if set_bar


    @baz = 3


    end


    def foo


    if !instance_variable_defined?(:@bar)


    puts "oh!"


    end


    @foo + @bar.to_i


    end


    end


    p Hello.new(true).foo # => 3


    p Hello.new(false).foo # => 1
    Handling "Undefined" Instance Variables
    Undefined IVs return `nil`, but how do we know it's undefined?
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    :@baz 2
    Hello Class
    Hello Instance
    In-Memory Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 1
    24 2
    32 3

    View Slide

  46. class Hello


    def initialize(set_bar)


    @foo = 1


    @bar = 2 if set_bar


    @baz = 3


    end


    def foo


    if !instance_variable_defined?(:@bar)


    puts "oh!"


    end


    @foo + @bar.to_i


    end


    end


    p Hello.new(true).foo # => 3


    p Hello.new(false).foo # => 1
    Handling "Undefined" Instance Variables
    Undefined IVs return `nil`, but how do we know it's undefined?
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    :@baz 2
    Hello Class
    Hello Instance
    In-Memory Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 1
    24 Qundef (0x24)
    32 3
    Cache Index 0 and 1

    View Slide

  47. Return `nil` for Undefined IVs
    If the value stored in the array is Qundef, return nil, otherwise return the value
    def getivar name, cache
      # get the class
      klass = get_self.class

      # If there is no cached index or the class doesn't match
      if !(cache.index && cache.klass == klass)
        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index

        # store the class
        cache.klass = klass
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        iv = get_self.instance_variables[index]

        if iv == Qundef
          nil
        else
          iv
        end
      else
        # do something different
      end
    end
    Return nil if Qundef

    View Slide

  48. 😫😫😫😫😫

    View Slide

  49. Conditionals for Reading an IV
    Just a Recap!
    • Is an index set?


    • Do the classes match?


    • Is it an "Object" type?


    • Is the IV value equal to Qundef?

    View Slide
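The last check in the recap exists because a slot holding `nil` and a slot that was never written must stay distinguishable. A minimal sketch of that idea, using a made-up `UNDEF` object as a stand-in for the VM's `Qundef` (the helper names `read_slot` and `slot_defined?` are also hypothetical):

```ruby
# Stand-in for Qundef: a unique object no Ruby program can store by accident.
UNDEF = Object.new.freeze

# Reading an undefined slot yields nil, like reading an unset ivar.
def read_slot(slots, index)
  value = slots[index]
  value.equal?(UNDEF) ? nil : value
end

# Only the sentinel means "never assigned"; an explicit nil counts as defined.
def slot_defined?(slots, index)
  !slots[index].equal?(UNDEF)
end

slots = [1, UNDEF, nil]

p read_slot(slots, 1)     # nil, because the slot was never written
p slot_defined?(slots, 1) # false
p read_slot(slots, 2)     # nil, because nil was stored explicitly
p slot_defined?(slots, 2) # true
```

Both reads return `nil`, but only the sentinel lets something like `instance_variable_defined?` answer correctly, which is why the read path pays for the extra comparison.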

  50. JIT Compilation

    View Slide

  51. JIT Compilation
    JIT compiler translates byte code to machine code
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    hi = Hello.new


    hi.foo
    Source Code Byte Code for "foo"
    [:getivar, :@foo, cache]
    [:getivar, :@bar, cache]
    [:plus]
    Machine Code
    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================


    # getinstancevariable


    # guard not immediate


    0x55a658d0a6dd: test qword ptr [r13 + 0x18], 7


    0x55a658d0a6e5: jne 0x55a660d0a0e5


    0x55a658d0a6eb: cmp qword ptr [r13 + 0x18], 8


    0x55a658d0a6f0: jbe 0x55a660d0a0fe


    0x55a658d0a6f6: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55a658d0a6fa: movabs rcx, 0x7fbb2af48f20


    0x55a658d0a704: cmp qword ptr [rax + 8], rcx


    0x55a658d0a708: jne 0x55a660d0a117


    0x55a658d0a70e: mov rax, qword ptr [r13 + 0x18]


    0x55a658d0a712: cmp qword ptr [rax + 0x10], 0


    0x55a658d0a717: jbe 0x55a660d0a0cc


    # guard embedded getivar


    0x55a658d0a71d: test word ptr [rax], 0x2000


    0x55a658d0a722: je 0x55a660d0a130


    0x55a658d0a728: cmp qword ptr [rax + 0x18], 0x34


    0x55a658d0a72d: mov ecx, 8


    0x55a658d0a732: cmovne rcx, qword ptr [rax + 0x18]


    0x55a658d0a737: mov qword ptr [rbx], rcx


    == BLOCK 2/5, ISEQ RANGE [3,6), 0 bytes =======================


    == BLOCK 3/5, ISEQ RANGE [3,6), 69 bytes ======================


    # getinstancevariable


    # regenerate_branch


    # getinstancevariable


    # regenerate_branch


    0x55a658d0a73a: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55a658d0a73e: movabs rcx, 0x7fbb2af48f20


    0x55a658d0a748: cmp qword ptr [rax + 8], rcx


    0x55a658d0a74c: jne 0x55a660d0a183


    0x55a658d0a752: mov rax, qword ptr [r13 + 0x18]


    0x55a658d0a756: cmp qword ptr [rax + 0x10], 1


    0x55a658d0a75b: jbe 0x55a660d0a162


    # guard embedded getivar


    0x55a658d0a761: test word ptr [rax], 0x2000


    0x55a658d0a766: je 0x55a660d0a19c


    0x55a658d0a76c: cmp qword ptr [rax + 0x20], 0x34


    0x55a658d0a771: mov ecx, 8


    0x55a658d0a776: cmovne rcx, qword ptr [rax + 0x20]


    0x55a658d0a77b: mov qword ptr [rbx + 8], rcx


    == BLOCK 4/5, ISEQ RANGE [6,8), 0 bytes =======================


    == BLOCK 5/5, ISEQ RANGE [6,9), 86 bytes ======================


    # opt_plus


    # regenerate_branch


    # opt_plus


    # guard arg0 fixnum


    # regenerate_branch


    0x55a658d0a77f: test byte ptr [rbx], 1


    0x55a658d0a782: je 0x55a660d0a1ef


    # guard arg1 fixnum


    0x55a658d0a788: test byte ptr [rbx + 8], 1


    0x55a658d0a78c: je 0x55a660d0a208


    0x55a658d0a792: mov rax, qword ptr [rbx]


    0x55a658d0a795: sub rax, 1


    0x55a658d0a799: add rax, qword ptr [rbx + 8]


    0x55a658d0a79d: jo 0x55a660d0a1ce


    0x55a658d0a7a3: mov qword ptr [rbx], rax


    # leave


    # RUBY_VM_CHECK_INTS(ec)


    0x55a658d0a7a6: mov eax, dword ptr [r12 + 0x24]


    0x55a658d0a7ab: not eax


    0x55a658d0a7ad: test dword ptr [r12 + 0x20], eax


    0x55a658d0a7b2: jne 0x55a660d0a221


    # pop stack frame


    0x55a658d0a7b8: mov rax, r13


    0x55a658d0a7bb: add rax, 0x40


    0x55a658d0a7bf: mov r13, rax



    View Slide

  52. Machine Code for Reading an IV
    def getivar name, cache
      # get the class
      klass = get_self.class

      # If there is no cached index or the class doesn't match
      if !(cache.index && cache.klass == klass)
        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index

        # store the class
        cache.klass = klass
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        iv = get_self.instance_variables[index]

        if iv == Qundef
          nil
        else
          iv
        end
      else
        # do something different
      end
    end
    Instruction Implementation
    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================


    # getinstancevariable


    # guard not immediate


    0x55a658d0a6dd: test qword ptr [r13 + 0x18], 7


    0x55a658d0a6e5: jne 0x55a660d0a0e5


    0x55a658d0a6eb: cmp qword ptr [r13 + 0x18], 8


    0x55a658d0a6f0: jbe 0x55a660d0a0fe


    0x55a658d0a6f6: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55a658d0a6fa: movabs rcx, 0x7fbb2af48f20


    0x55a658d0a704: cmp qword ptr [rax + 8], rcx


    0x55a658d0a708: jne 0x55a660d0a117


    0x55a658d0a70e: mov rax, qword ptr [r13 + 0x18]


    0x55a658d0a712: cmp qword ptr [rax + 0x10], 0


    0x55a658d0a717: jbe 0x55a660d0a0cc


    # guard embedded getivar


    0x55a658d0a71d: test word ptr [rax], 0x2000


    0x55a658d0a722: je 0x55a660d0a130


    0x55a658d0a728: cmp qword ptr [rax + 0x18], 0x34


    0x55a658d0a72d: mov ecx, 8


    0x55a658d0a732: cmovne rcx, qword ptr [rax + 0x18]


    0x55a658d0a737: mov qword ptr [rbx], rcx
    Generated Machine Code

    View Slide

  53. 93 Bytes for 1 IV

    View Slide

  54. We Can Do Better!

    View Slide

  55. Object Shapes

    View Slide

  56. Not these shapes

    View Slide

  57. View Slide

  58. Shape Transitions on Write
    Shapes form a tree representing Object properties
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end
    Sample Code Shape Tree
    Root


    id: 0
    @foo


    id: 1, index: 0
    @bar


    id: 2, index: 1
    @foo
    @bar
    from: 0, to: 1, iv index: 0
    from:1, to: 2, iv index: 1
    Cache Key
    Cache Key
    Destination
    Shape
    Destination
    Shape
    IV Index
    IV Index

    View Slide

  59. Shape Tree
    Shape Transitions on Write
    Shape ID is used as the cache key
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end
    Sample Code
    Root


    id: 0
    @foo


    id: 1, index: 0
    @bar


    id: 2, index: 1
    @foo
    @bar
    from: 0, to: 1, iv index: 0
    from:1, to: 2, iv index: 1

    View Slide

  60. Shapes Form a Graph

    View Slide

  61. All Objects Start From a


    "Root Shape"

    View Slide

  62. Objects Only Change Shape


    on Writes

    View Slide

  63. Shape ID is the Cache Key

    View Slide

  64. Class is not a cache key

    View Slide
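The rules on the last few slides (start at a root shape, change shape only on writes, identify a layout by shape ID rather than by class) can be sketched as a small tree in Ruby. This is a toy model under stated assumptions, not the VM's data structure: `Shape`, `ShapedObject`, `transition` and `index_of` are hypothetical names.

```ruby
# Toy shape node: one node per "ivar added" step on the way from the root.
class Shape
  attr_reader :id, :parent, :name, :index

  @count = 0
  def self.next_id
    id = @count
    @count += 1
    id
  end

  def initialize(parent = nil, name = nil)
    @id = Shape.next_id
    @parent = parent
    @name = name
    @index = parent ? parent.size : nil # slot used by the ivar this node adds
    @edges = {}
  end

  # Number of ivars described by this shape.
  def size
    @parent ? @parent.size + 1 : 0
  end

  # Follow (or create) the edge taken when `name` is written for the first time.
  def transition(name)
    @edges[name] ||= Shape.new(self, name)
  end

  # Walk towards the root looking for the slot of `name`.
  def index_of(name)
    shape = self
    while shape.parent
      return shape.index if shape.name == name
      shape = shape.parent
    end
    nil
  end
end

ROOT_SHAPE = Shape.new # id 0

class ShapedObject
  attr_reader :shape

  def initialize
    @shape = ROOT_SHAPE # every object starts at the root shape
    @slots = []
  end

  def ivar_set(name, value)
    index = @shape.index_of(name)
    unless index
      @shape = @shape.transition(name) # shape changes only on a new write
      index = @shape.index
    end
    @slots[index] = value
  end

  def ivar_get(name)
    index = @shape.index_of(name)
    index && @slots[index]
  end
end

hello = ShapedObject.new
hello.ivar_set(:@foo, 1)
hello.ivar_set(:@bar, 2)

world = ShapedObject.new # imagine an instance of a different class
world.ivar_set(:@foo, 10)
world.ivar_set(:@bar, 20)

p hello.shape.id                  # 2
p hello.shape.equal?(world.shape) # true
```

Because both objects wrote `@foo` and then `@bar`, they end on the same node, so a cache keyed on shape ID 2 would serve both regardless of their class.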

  65. Objects can share shapes
    Hello and World can share caches
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    end


    class World < Hello


    def initialize


    super


    @baz = 3


    end


    end


    Sample Code Shape Tree
    from: 0, to: 1, iv index: 0
    from:1, to: 2, iv index: 1
    from:2, to: 3, iv index: 2
    Root


    id: 0
    @foo


    id: 1, index: 0
    @bar


    id: 2, index: 1
    @baz


    id:3, index:2
    Shared between
    Hello and World
    instances

    View Slide

  66. Cross Type Memory Amortization

    View Slide

  67. Cross Type Cache Hits

    View Slide

  68. Shared Shape Tree
    Shape Tree is Shared
    All objects use the shape tree, so more types can share info
    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    class World < Hello


    end


    hello = Hello.new


    world = World.new


    loop do


    hello.foo


    world.foo


    end
    IV Index Table
    Name Index
    :@foo 0
    :@bar 1
    Hello Class
    Name Index
    :@foo 0
    :@bar 1
    World Class
    Root


    id: 0
    @foo


    id: 1, index: 0
    @bar


    id: 2, index: 1
    Same shape on
    both instances
    Cache Shape 2 and 2

    View Slide

  69. Cross Type Cache Hits
    require 'harness'


    class Hello


    def initialize


    @foo = 1


    @bar = 2


    end


    def foo


    @foo + @bar


    end


    end


    class World < Hello


    end


    hello = Hello.new


    world = World.new


    run_benchmark(100) do


    i = 0


    while i < 90_000


    hello.foo


    world.foo


    i += 1


    end


    end
    Microbenchmark
    before: ruby 3.2.0dev (2022-09-28T14:51:38Z before-shapes a05b261464) [x86_64-linux]
    after: ruby 3.2.0dev (2022-11-22T05:20:45Z master 20b9d7b9fd) [x86_64-linux]

    -------------------  -----------  ----------  ----------  ----------  ------------  -------------
    bench                before (ms)  stddev (%)  after (ms)  stddev (%)  before/after  after 1st itr
    getivar-polymorphic  12.1         1.4         4.4         2.1         2.76          2.82
    -------------------  -----------  ----------  ----------  ----------  ------------  -------------

    Legend:
    - before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
    - after 1st itr: ratio of before/after time for the first benchmarking iteration.


    Results
    2.76x Faster


  70. Memory Usage Improvements


  71. Classes store their name as an IV


  72. Class Name is an IV
    Class names are stored as an instance variable on the class instance
    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    puts Hello.name # => IV read




  75. Not All Properties Are


    Instance Variables


  76. Freezing is a shape transition


  77. Freezing Changes Shape
    When we freeze an object, it changes shape
    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def set
        @baz = 3
      end
    end

    hello = Hello.new
    hello.set

    hello = Hello.new
    hello.freeze
    hello.set
    Sample Code Shape Tree
    Root


    id: 0
    @foo


    id: 1, index: 0
    @bar


    id: 2, index: 1
    @baz


    id:3, index:2
    Shape: 2
    from: 0, to: 1, iv index: 0
    from: 1, to: 2, iv index: 1
    from: 2, to: 3, iv index: 2
    Shape: 3
    Shape: 2
    Shape: 4
    @foo
    @bar
    @baz
    frozen


    id:4
    freeze


  78. Set Instance Variable Instruction
    Frozen check only on cache misses
    def setinstancevariable iv_name, cache
      if get_self.frozen?
        raise "It's frozen!"
      end

      if cache.klass == get_self.class && cache.index
        # CACHE HIT!!
        # set the instance variable
      else
        cache.klass = get_self.class
        cache.index = get_self.iv_index_table[iv_name]
        # set the instance variable
      end
    end
    Before Shapes
    def setinstancevariable iv_name, cache
      if cache.from_shape_id == get_self.shape_id
        # CACHE HIT!!
        # set the instance variable
      else
        if get_self.frozen?
          raise "It's frozen!"
        end

        cache.from_shape_id = get_self.shape_id
        # set the instance variable
      end
    end
    After Shapes

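The "after shapes" pseudocode can be turned into a small runnable model. Everything here is an assumption made for illustration: `FakeObject`, `Cache`, `SHAPES`, and `set_ivar` are invented names, the shape table is hard-coded, and only overwriting an existing instance variable is handled (no shape transitions). The point it demonstrates is that a frozen object has a different shape id, so it can never hit a cache filled by an unfrozen object, and the frozen check can live on the miss path.

```ruby
# Hypothetical model; none of these names are CRuby internals.
FakeObject = Struct.new(:shape_id, :ivs)
Cache = Struct.new(:shape_id, :index)

# Shape 2 holds @foo and @bar; shape 4 is its frozen child (ids as on slide 77).
SHAPES = {
  2 => { frozen: false, index: { :@foo => 0, :@bar => 1 } },
  4 => { frozen: true,  index: { :@foo => 0, :@bar => 1 } },
}.freeze

def set_ivar(obj, name, value, cache)
  if cache.shape_id == obj.shape_id
    # Cache hit: the cached shape was unfrozen when it was recorded,
    # so no frozen check is needed on this path.
    obj.ivs[cache.index] = value
    :hit
  else
    # Cache miss: this is the only place the frozen check runs.
    shape = SHAPES.fetch(obj.shape_id)
    raise FrozenError, "It's frozen!" if shape[:frozen]

    cache.shape_id = obj.shape_id
    cache.index = shape[:index].fetch(name)
    obj.ivs[cache.index] = value
    :miss
  end
end

cache = Cache.new(nil, nil)
live = FakeObject.new(2, [1, 2])
results = [set_ivar(live, :@bar, 10, cache), set_ivar(live, :@bar, 20, cache)]

frozen = FakeObject.new(4, [1, 2])
frozen_error = begin
  set_ivar(frozen, :@bar, 30, cache)
  false
rescue FrozenError
  true
end

p results
p live.ivs
p frozen_error
```

The frozen object misses the cache because its shape id is 4 rather than 2, and the miss path raises before anything is written.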

  79. Frozen Checks only on


    Cache Misses


  80. IV Write Performance Improvement
    require 'harness'

    class TheClass
      def initialize
        @v0 = 1
        @v1 = 2
        @v3 = 3
        @levar = 1
      end

      def set_value_loop
        # 1M
        i = 0
        while i < 1000000
          # 10 times to de-emphasize loop overhead
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          i += 1
        end
      end
    end

    obj = TheClass.new

    run_benchmark(100) do
      obj.set_value_loop
    end
    Micro Benchmark
    before: ruby 3.2.0dev (2022-09-28T14:51:38Z before-shapes a05b261464) [x86_64-linux]
    after: ruby 3.2.0dev (2022-11-22T05:20:45Z master 20b9d7b9fd) [x86_64-linux]

    -------  -----------  ----------  ----------  ----------  ------------  -------------
    bench    before (ms)  stddev (%)  after (ms)  stddev (%)  before/after  after 1st itr
    setivar  64.0         0.7         53.0        2.5         1.21          1.19
    -------  -----------  ----------  ----------  ----------  ------------  -------------

    Legend:
    - before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
    - after 1st itr: ratio of before/after time for the first benchmarking iteration.
    Results
    21% Faster


  81. JIT Performance


  82. Object Layout
    All objects have 2 common fields: "flags" and "class"
    Basic Object Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16
    24
    32
    T_OBJECT
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Instance Variable
    24 Instance Variable
    32 Instance Variable
    T_ARRAY
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Array Element
    24 Array Element
    32 Array Element


  83. Let's Check the Flags Field!


  84. 64 bits (width of a pointer)


  85. Flags Bitmap Layout
    Bottom 5 bits represent Object Type
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    ruby_value_type {
        RUBY_T_OBJECT   = 0x01, /**< @see struct ::RObject */
        RUBY_T_CLASS    = 0x02, /**< @see struct ::RClass and ::rb_cClass */
        RUBY_T_MODULE   = 0x03, /**< @see struct ::RClass and ::rb_cModule */
        RUBY_T_FLOAT    = 0x04, /**< @see struct ::RFloat */
        RUBY_T_STRING   = 0x05, /**< @see struct ::RString */
        RUBY_T_REGEXP   = 0x06, /**< @see struct ::RRegexp */
        RUBY_T_ARRAY    = 0x07, /**< @see struct ::RArray */
        RUBY_T_HASH     = 0x08, /**< @see struct ::RHash */
        RUBY_T_STRUCT   = 0x09, /**< @see struct ::RStruct */
        RUBY_T_BIGNUM   = 0x0a, /**< @see struct ::RBignum */
        RUBY_T_FILE     = 0x0b, /**< @see struct ::RFile */
        RUBY_T_DATA     = 0x0c, /**< @see struct ::RTypedData */
        RUBY_T_MATCH    = 0x0d, /**< @see struct ::RMatch */
        RUBY_T_COMPLEX  = 0x0e, /**< @see struct ::RComplex */
        RUBY_T_RATIONAL = 0x0f, /**< @see struct ::RRational */
    }

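Since the object type lives in the bottom 5 bits of the flags word, it can be recovered with a mask. The sketch below does this in Ruby on a plain integer. The mask value `0x1f` follows from "bottom 5 bits"; the `TYPE_NAMES` table lists only a few of the values from the slide, and the sample flags word is made up for the example.

```ruby
# Bottom 5 bits of the flags word hold the object type.
RUBY_T_MASK = 0x1f

# A few of the type codes listed on the slide (not the full enum).
TYPE_NAMES = {
  0x01 => :T_OBJECT,
  0x05 => :T_STRING,
  0x07 => :T_ARRAY,
  0x08 => :T_HASH,
}.freeze

def object_type(flags)
  TYPE_NAMES.fetch(flags & RUBY_T_MASK, :unknown)
end

# Made-up flags word: some higher bit set, plus the T_ARRAY type code.
flags = 0x2000 | 0x07

puts object_type(flags)
puts object_type(0x01)
```

Bits above the mask do not affect the result, which is why the higher bits are free to carry other per-object information.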

  86. Flags Bitmap Layout
    Bottom 12 bits have a common "meaning" (see fl_type.h)
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Object ID has
    been seen?
    Object.new.object_id


    [].object_id


  87. Flags Bitmap Layout
    Bottom 12 bits have a common "meaning" (see fl_type.h)
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Object ID has
    been seen?
    Object.new.object_id


    [].object_id


  88. Flags Bitmap Layout
    Object Type gives upper bits meaning
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Object
    Extended?
    Depends on Object Type


  89. T_OBJECT Extended Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Pointer to Buffer
    24
    32
    IV Array
    Byte Value
    0 Instance Variable
    8 Instance Variable
    16 Instance Variable
    24 Instance Variable
    32 Instance Variable
    ... ...
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
        @hoge = 4
      end
    end

    Hello.new


    T_OBJECT Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Instance Variable
    24 Instance Variable
    32 Instance Variable
    T_OBJECT Layout


  90. Flags Bitmap Layout
    Extended Bit means "read from external table"
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Object
    Extended?
    Depends on Object Type


  91. JIT Compilation
    JIT compilation must write guards for assumptions
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
        @hoge = 4
      end

      def foo
        @foo + @bar
      end
    end
    What is the type?
    Is it embedded or extended?
    Is the IV Qundef?
    Is the Class correct?


  92. Runtime Check Locations
    We need to test object type, extended bit, IV value
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Object
    Extended?
    Depends on Object Type
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Instance Variable
    24 Instance Variable
    32 Instance Variable
    Object Type
    Qundef?
    Right
    class?


  93. Machine Code for reading one IV
    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================


    # getinstancevariable


    # guard not immediate


    0x55a658d0a6dd: test qword ptr [r13 + 0x18], 7


    0x55a658d0a6e5: jne 0x55a660d0a0e5


    0x55a658d0a6eb: cmp qword ptr [r13 + 0x18], 8


    0x55a658d0a6f0: jbe 0x55a660d0a0fe


    0x55a658d0a6f6: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55a658d0a6fa: movabs rcx, 0x7fbb2af48f20


    0x55a658d0a704: cmp qword ptr [rax + 8], rcx


    0x55a658d0a708: jne 0x55a660d0a117


    0x55a658d0a70e: mov rax, qword ptr [r13 + 0x18]


    0x55a658d0a712: cmp qword ptr [rax + 0x10], 0


    0x55a658d0a717: jbe 0x55a660d0a0cc


    # guard embedded getivar


    0x55a658d0a71d: test word ptr [rax], 0x2000


    0x55a658d0a722: je 0x55a660d0a130


    0x55a658d0a728: cmp qword ptr [rax + 0x18], 0x34


    0x55a658d0a72d: mov ecx, 8


    0x55a658d0a732: cmovne rcx, qword ptr [rax + 0x18]


    0x55a658d0a737: mov qword ptr [rbx], rcx
    Generated Machine Code


  94. Use Shapes to


    Eliminate Checks


  95. Shape ID Storage
    Shape id is stored in the upper 32 bits
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Depends on Object Type
    Shape ID

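Storing the shape id in the upper 32 bits of the 64-bit flags word means it can be read either by shifting or, on a little-endian machine, by loading the 4 bytes at offset 4. The Ruby sketch below models both views on a plain integer. The helper names `shape_id` and `with_shape_id` are invented for the example; the shape id `0x19` is the value that appears in the "guard shape" comparison in the JIT listing later in the deck, and the x86_64 little-endian byte order is assumed from the `[x86_64-linux]` benchmark headers.

```ruby
SHAPE_ID_SHIFT = 32
LOWER_32_MASK = 0xFFFF_FFFF

# Hypothetical helpers modelling the flags word as a Ruby Integer.
def shape_id(flags)
  flags >> SHAPE_ID_SHIFT
end

def with_shape_id(flags, id)
  (flags & LOWER_32_MASK) | (id << SHAPE_ID_SHIFT)
end

# A T_OBJECT (type code 0x01) whose shape id is 0x19.
flags = with_shape_id(0x01, 0x19)

# Lay the word out as 8 little-endian bytes and read the dword at offset 4.
bytes = [flags].pack("Q<")
upper_dword = bytes.byteslice(4, 4).unpack1("L<")

puts shape_id(flags)
puts upper_dword
puts flags & 0x1f
```

Both reads give the same number, which is why a single 32-bit compare against the flags field at offset 4 is enough to test the shape.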

  96. Class Check Isn't Necessary
    Shapes are independent of class
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Object
    Extended?
    Depends on Object Type
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Instance Variable
    24 Instance Variable
    32 Instance Variable
    Object Type
    Qundef?
    Right
    class?
    Shape ID


  97. Handling "Undefined" Instance Variables
    Shapes care about IV set order
    class Hello
      def initialize(set_bar)
        @foo = 1
        @bar = 2 if set_bar
        @baz = 3
      end

      def foo
        if !instance_variable_defined?(:@bar)
          puts "oh!"
        end
        @foo + @bar.to_i
      end
    end

    p Hello.new(true).foo # => 3
    p Hello.new(false).foo # => 1
    Root


    id: 0
    @foo


    id: 1
    @bar


    id: 2
    @baz


    id: 3
    @baz


    id: 4
    Shape 3
    Shape 4


  98. Handling "Undefined" Instance Variables
    Shape 3 has a "bar" instance variable
    class Hello
      def initialize(set_bar)
        @foo = 1
        @bar = 2 if set_bar
        @baz = 3
      end

      def foo
        if !instance_variable_defined?(:@bar)
          puts "oh!"
        end
        @foo + @bar.to_i
      end
    end

    p Hello.new(true).foo # => 3
    p Hello.new(false).foo # => 1
    Shape 3
    Root


    id: 0
    @foo


    id: 1
    @bar


    id: 2
    @baz


    id: 3
    @baz


    id: 4


  99. Handling "Undefined" Instance Variables
    Shape 4 doesn't have a "bar" instance variable
    class Hello
      def initialize(set_bar)
        @foo = 1
        @bar = 2 if set_bar
        @baz = 3
      end

      def foo
        if !instance_variable_defined?(:@bar)
          puts "oh!"
        end
        @foo + @bar.to_i
      end
    end

    p Hello.new(true).foo # => 3
    p Hello.new(false).foo # => 1
    Shape 4
    Root


    id: 0
    @foo


    id: 1
    @bar


    id: 2
    @baz


    id: 3
    @baz


    id: 4

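The last three slides can be condensed into a small model showing why a shape id alone tells you whether `@bar` is defined. This is a sketch under assumptions: the `Shape` class, its `registry` argument, and the `defines?` method are hypothetical and only mimic the tree drawn on the slides. An object that skips `@bar` takes a different edge out of the `@foo` shape and therefore ends on a different shape.

```ruby
# Hypothetical model of the shape tree; not CRuby's implementation.
class Shape
  attr_reader :id, :parent, :iv_name

  def initialize(registry, parent = nil, iv_name = nil)
    @registry = registry
    @id = registry.size # ids are assigned in creation order
    @parent = parent
    @iv_name = iv_name
    @edges = {}
    registry << self
  end

  # Follow (or create) the edge for the first assignment of `name`.
  def transition(name)
    @edges[name] ||= Shape.new(@registry, self, name)
  end

  # An instance variable is defined if some ancestor edge introduced it.
  def defines?(name)
    shape = self
    while shape
      return true if shape.iv_name == name
      shape = shape.parent
    end
    false
  end
end

registry = []
root = Shape.new(registry)

# Hello.new(true): @foo, @bar, @baz
with_bar = root.transition(:@foo).transition(:@bar).transition(:@baz)
# Hello.new(false): @foo, @baz
without_bar = root.transition(:@foo).transition(:@baz)

puts with_bar.id
puts without_bar.id
puts with_bar.defines?(:@bar)
puts without_bar.defines?(:@bar)
```

In this model the two `@baz` shapes are distinct nodes, so code guarded on the first shape id can assume `@bar` exists without checking the slot for an undefined value.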

  100. Class Check Isn't Necessary
    Shapes are independent of class
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Object
    Extended?
    Depends on Object Type
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Instance Variable
    24 Instance Variable
    32 Instance Variable
    Object Type
    Qundef?
    Shape ID


  101. Multiple Possible Layouts
    Objects can vary in width, so there are 2 possible layouts
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new
    Embedded Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 1
    24 2
    32 3
    Extended Layout
    Byte Value
    0 Flags (a 64 bit bitmap)
    8 Pointer to Class
    16 Pointer to Buffer
    24
    32
    IV Array
    Byte Value
    0 1
    8 2
    16 3
    24 ...
    32 ...
    ... ...


  102. Multiple Possible Layouts
    "Extending" adds a shape transition
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new
    Extended Layout
    Byte Value
    0 Flags
    8 Class
    16
    24
    Byte Val
    0 1
    8 2
    16 3
    24 ...
    32 ...
    ... ...
    IV Ptr
    1
    2
    Root


    id: 0
    @bar


    id: 2
    @foo


    id: 1
    EXTEND


    id: 3
    @baz


    id: 4
    Shape 4


  103. Multiple Possible Layouts
    "Extending" adds a shape transition
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new
    Root


    id: 0
    @bar


    id: 2
    @foo


    id: 1
    EXTEND


    id: 3
    @baz


    id: 4
    @baz


    id: 5
    Embedded Layout
    Byte Value
    0 Flags
    8 Class
    16
    24
    32
    2
    3
    1
    Shape 5


  104. Different Layouts Have Different Shapes
    JIT Compiler can differentiate based on shape id
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new
    Root


    id: 0
    @bar


    id: 2
    @foo


    id: 1
    EXTEND


    id: 3
    @baz


    id: 4
    @baz


    id: 5
    Embedded Layout
    Byte Value
    0 Flags
    8 Class
    16 1
    24 2
    32 3
    Extended Layout
    Byte Value
    0 Flags
    8 Class
    16 PTR
    24
    Byte Val
    0 1
    8 2
    16 3
    24 ...
    32 ...
    ... ...


  105. Extended Check Isn't Necessary
    Shapes differ depending on embedded vs extended
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Object
    Extended?
    Depends on Object Type
    Object Type
    Shape ID


  106. Different Types, Same Shape
    Different types can have the same shape, but IV storage is different
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
      end
    end

    Hello.new

    ary = []
    ary.instance_variable_set(:@foo, 4)
    ary.instance_variable_set(:@bar, 5)
    ary.instance_variable_set(:@baz, 6)
    ary
    Sample Code Shape Tree
    Root


    id: 0
    @foo


    id: 1
    @bar


    id: 2
    @baz


    id: 3
    Shape 3
    Shape 3


  107. Different Types Store Instance
    Variables Differently.


  108. Assign Shape at Allocation Time
    When a T_OBJECT is allocated, immediately set a new shape
    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
      end
    end

    Hello.new

    ary = []
    ary.instance_variable_set(:@foo, 4)
    ary.instance_variable_set(:@bar, 5)
    ary.instance_variable_set(:@baz, 6)
    ary
    Sample Code Shape Tree
    Root


    id: 0
    Shape 4
    Shape 7
    T_OBJECT


    id: 1
    @foo


    id: 2
    @bar


    id: 3
    @baz


    id: 4
    @foo


    id: 5
    @bar


    id: 6
    @baz


    id: 7

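A rough way to see the effect of assigning a shape at allocation time is to add a type edge to a toy shape tree. The sketch below is an assumption-laden model: `Shape`, `registry`, and the `:T_OBJECT` edge key are illustrative names, not CRuby's API. It reproduces the ids drawn on the slide, where the `Hello` instance ends on shape 4 and the array ends on shape 7 even though both set `@foo`, `@bar`, `@baz` in the same order.

```ruby
# Hypothetical model of the shape tree; not CRuby's implementation.
class Shape
  attr_reader :id, :edge_name

  def initialize(registry, edge_name = nil)
    @registry = registry
    @id = registry.size # ids are assigned in creation order
    @edge_name = edge_name
    @edges = {}
    registry << self
  end

  def transition(edge_name)
    @edges[edge_name] ||= Shape.new(@registry, edge_name)
  end
end

registry = []
root = Shape.new(registry)
iv_names = [:@foo, :@bar, :@baz]

# A T_OBJECT takes a type-specific transition as soon as it is allocated.
object_root = root.transition(:T_OBJECT)
hello_shape = iv_names.reduce(object_root) { |shape, name| shape.transition(name) }

# The array in this model starts from the root shape directly.
array_shape = iv_names.reduce(root) { |shape, name| shape.transition(name) }

puts object_root.id
puts hello_shape.id
puts array_shape.id
```

Because the two objects no longer share any shape below the root, a guard on shape id 4 also implies "this is a T_OBJECT", which is what makes a separate object type check redundant.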

  109. Object Type Check Isn't Necessary
    Shapes differ depending on object type
    Flags Bitmap
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    Object Type
    Common Stuff (see fl_type.h)
    Depends on Object Type
    Object Type
    Shape ID


  110. Only Shape ID Check is Required


  111. JIT Code Comparison
    Machine code for reading 1 instance variable
    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================


    # getinstancevariable


    # guard not immediate


    0x55ce5998b6dd: test qword ptr [r13 + 0x18], 7


    0x55ce5998b6e5: jne 0x55ce6198b0e5


    0x55ce5998b6eb: cmp qword ptr [r13 + 0x18], 8


    0x55ce5998b6f0: jbe 0x55ce6198b0fe


    0x55ce5998b6f6: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55ce5998b6fa: movabs rcx, 0x7f09fd1e8f30


    0x55ce5998b704: cmp qword ptr [rax + 8], rcx


    0x55ce5998b708: jne 0x55ce6198b117


    0x55ce5998b70e: mov rax, qword ptr [r13 + 0x18]


    0x55ce5998b712: cmp qword ptr [rax + 0x10], 0


    0x55ce5998b717: jbe 0x55ce6198b0cc


    # guard embedded getivar


    0x55ce5998b71d: test word ptr [rax], 0x2000


    0x55ce5998b722: je 0x55ce6198b130


    0x55ce5998b728: cmp qword ptr [rax + 0x18], 0x34


    0x55ce5998b72d: mov ecx, 8


    0x55ce5998b732: cmovne rcx, qword ptr [rax + 0x18]


    0x55ce5998b737: mov qword ptr [rbx], rcx
    Before Object Shapes
    == BLOCK 1/5, ISEQ RANGE [0,3), 40 bytes ======================


    # getinstancevariable


    0x5594850ba13a: mov rax, qword ptr [r13 + 0x18]


    # guard object is heap


    0x5594850ba13e: test al, 7


    0x5594850ba141: jne 0x5594850bc090


    0x5594850ba147: cmp rax, 0


    0x5594850ba14b: je 0x5594850bc090


    # guard shape


    0x5594850ba151: cmp dword ptr [rax + 4], 0x19


    0x5594850ba155: jne 0x5594850bc0a9


    0x5594850ba15b: mov rax, qword ptr [rax + 0x10]


    0x5594850ba15f: mov qword ptr [rbx], rax
    After Object Shapes


  112. JIT Code Comparison
    Machine code for reading 1 instance variable
    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================


    # getinstancevariable


    # guard not immediate


    0x55ce5998b6dd: test qword ptr [r13 + 0x18], 7


    0x55ce5998b6e5: jne 0x55ce6198b0e5


    0x55ce5998b6eb: cmp qword ptr [r13 + 0x18], 8


    0x55ce5998b6f0: jbe 0x55ce6198b0fe


    0x55ce5998b6f6: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55ce5998b6fa: movabs rcx, 0x7f09fd1e8f30


    0x55ce5998b704: cmp qword ptr [rax + 8], rcx


    0x55ce5998b708: jne 0x55ce6198b117


    0x55ce5998b70e: mov rax, qword ptr [r13 + 0x18]


    0x55ce5998b712: cmp qword ptr [rax + 0x10], 0


    0x55ce5998b717: jbe 0x55ce6198b0cc


    # guard embedded getivar


    0x55ce5998b71d: test word ptr [rax], 0x2000


    0x55ce5998b722: je 0x55ce6198b130


    0x55ce5998b728: cmp qword ptr [rax + 0x18], 0x34


    0x55ce5998b72d: mov ecx, 8


    0x55ce5998b732: cmovne rcx, qword ptr [rax + 0x18]


    0x55ce5998b737: mov qword ptr [rbx], rcx
    Before Object Shapes
    == BLOCK 1/5, ISEQ RANGE [0,3), 40 bytes ======================


    # getinstancevariable


    0x5594850ba13a: mov rax, qword ptr [r13 + 0x18]


    # guard object is heap


    0x5594850ba13e: test al, 7


    0x5594850ba141: jne 0x5594850bc090


    0x5594850ba147: cmp rax, 0


    0x5594850ba14b: je 0x5594850bc090


    # guard shape


    0x5594850ba151: cmp dword ptr [rax + 4], 0x19


    0x5594850ba155: jne 0x5594850bc0a9


    After Object Shapes
    Make sure it's
    shape 0x19


  113. JIT Code Comparison
    Machine code for reading 1 instance variable
    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================


    # getinstancevariable


    # guard not immediate


    0x55ce5998b6dd: test qword ptr [r13 + 0x18], 7


    0x55ce5998b6e5: jne 0x55ce6198b0e5


    0x55ce5998b6eb: cmp qword ptr [r13 + 0x18], 8


    0x55ce5998b6f0: jbe 0x55ce6198b0fe


    0x55ce5998b6f6: mov rax, qword ptr [r13 + 0x18]


    # guard known class


    0x55ce5998b6fa: movabs rcx, 0x7f09fd1e8f30


    0x55ce5998b704: cmp qword ptr [rax + 8], rcx


    0x55ce5998b708: jne 0x55ce6198b117


    0x55ce5998b70e: mov rax, qword ptr [r13 + 0x18]


    0x55ce5998b712: cmp qword ptr [rax + 0x10], 0


    0x55ce5998b717: jbe 0x55ce6198b0cc


    # guard embedded getivar


    0x55ce5998b71d: test word ptr [rax], 0x2000


    0x55ce5998b722: je 0x55ce6198b130


    0x55ce5998b728: cmp qword ptr [rax + 0x18], 0x34


    0x55ce5998b72d: mov ecx, 8


    0x55ce5998b732: cmovne rcx, qword ptr [rax + 0x18]


    0x55ce5998b737: mov qword ptr [rbx], rcx
    Before Object Shapes
    == BLOCK 1/5, ISEQ RANGE [0,3), 40 bytes ======================


    # getinstancevariable


    0x5594850ba13a: mov rax, qword ptr [r13 + 0x18]


    # guard object is heap


    0x5594850ba13e: test al, 7


    0x5594850ba141: jne 0x5594850bc090


    0x5594850ba147: cmp rax, 0


    0x5594850ba14b: je 0x5594850bc090


    # guard shape


    0x5594850ba151: cmp dword ptr [rax + 4], 0x19


    0x5594850ba155: jne 0x5594850bc0a9


    0x5594850ba15b: mov rax, qword ptr [rax + 0x10]


    0x5594850ba15f: mov qword ptr [rbx], rax
    After Object Shapes
    Read the IV,
    and push on the
    stack


  114. Benchmark Comparison
    Measure the cost of fetching an instance variable
    class TheClass
      def initialize
        @v0 = 1
        @v1 = 2
        @v3 = 3
        @levar = 1
      end

      def get_value_loop
        sum = 0

        # 1M
        i = 0
        while i < 1000000
          # 10 times to de-emphasize loop overhead
          sum += (@levar +
                  @levar +
                  @levar +
                  @levar +
                  @levar +
                  @levar +
                  @levar +
                  @levar +
                  @levar +
                  @levar)
          i += 1
        end

        return sum
      end
    end

    obj = TheClass.new

    run_benchmark(100) do
      obj.get_value_loop
    end


  115. Benchmark Results
    before: ruby 3.2.0dev (2022-09-28T14:51:38Z before-shapes a05b261464) +YJIT [x86_64-linux]
    after: ruby 3.2.0dev (2022-11-22T05:20:45Z master 20b9d7b9fd) +YJIT [x86_64-linux]

    -------  -----------  ----------  ----------  ----------  ------------  -------------
    bench    before (ms)  stddev (%)  after (ms)  stddev (%)  before/after  after 1st itr
    getivar  17.4         0.5         12.0        0.3         1.45          0.97
    -------  -----------  ----------  ----------  ----------  ------------  -------------

    Legend:
    - before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
    - after 1st itr: ratio of before/after time for the first benchmarking iteration.
    45% Speedup!

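The headline percentages in the deck come straight from the before/after ratio in each results table. As a quick check, the arithmetic can be reproduced from the rounded millisecond figures shown on the slides. The helper names `speedup` and `percent_faster` are made up for this sketch, and because the inputs are already rounded, the results may differ in the last digit from ratios computed on raw timings.

```ruby
# Ratio of before time to after time; above 1 means "after" is faster.
def speedup(before_ms, after_ms)
  before_ms / after_ms
end

# The same ratio expressed as a whole-number percentage improvement.
def percent_faster(before_ms, after_ms)
  ((speedup(before_ms, after_ms) - 1) * 100).round
end

# getivar with YJIT: 17.4 ms before, 12.0 ms after.
getivar_ratio = speedup(17.4, 12.0).round(2)
getivar_percent = percent_faster(17.4, 12.0)

# setivar in the interpreter: 64.0 ms before, 53.0 ms after.
setivar_ratio = speedup(64.0, 53.0).round(2)
setivar_percent = percent_faster(64.0, 53.0)

puts "getivar: #{getivar_ratio}x (#{getivar_percent}% faster)"
puts "setivar: #{setivar_ratio}x (#{setivar_percent}% faster)"
```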

  116. Before Shapes: 3.76x


  117. After Shapes: 5.42x


  118. Future: 32 byte Objects


  119. TL;DR


  120. Fewer Checks


  121. Faster Code


  122. Thank You!!
