Ruby 3.3
- Released at: Dec 25, 2023 (NEWS.md file)
- Status (as of Dec 28, 2023): 3.3.0 is recently released
- This document first published: Dec 25, 2023
- Last change to this document: Dec 28, 2023
🇺🇦 🇺🇦 Before you start reading the changelog: A full-scale Russian invasion into my home country continues, for the second year in a row. The only reason I am alive and able to work on the changelog is the Armed Forces of Ukraine, and international support with weaponry, funds and information. I got into the Army in March, and spent the summer on the frontlines. Now I have moved to another army position which frees some time to work for the Ruby community. You can read my recent text (that was supposed to be a RubyConf talk) as an appeal to the community. Please spread information, lobby our cause and donate.🇺🇦 🇺🇦
Note: As already explained in Introduction, this site is dedicated to changes in the language, not the implementation, therefore the list below lacks mentions of lots of important internal changes related to performance optimizations, parser, and JIT that happened in 3.3 (which is, on the other hand, somewhat lighter on the “small quality of life improvement” changes). The changes aren’t covered not because they are not important, just because this site’s goals are different. See the official release notes that cover those significant internal changes.
Highlights
- `it` will become anonymous block argument in 3.4
- `Module#set_temporary_name`
- `ObjectSpace::WeakKeyMap`
- `Range#overlap?`
- `Fiber#kill`
Language changes
Standalone `it` in blocks will become anonymous argument in Ruby 3.4
In Ruby 3.3, it will just warn to prepare for a change.
- Reason: Numeric designation for anonymous block arguments (`_1`, `_2`, and so on) was considered ugly by many people, so after years of discussion, the `it` keyword is to be introduced in the next Ruby version; for now, it just warns in places where it would be considered an anonymous block argument.
- Discussion: Feature #18980
- Code: In the code below, where Ruby 3.3 currently produces a warning, Ruby 3.4 would treat `it` as an anonymous block argument; where Ruby 3.3 doesn’t produce a warning, Ruby 3.4 would treat `it` as a local variable name or a method call (and would look for such names available in the scope).

    # The cases that are warned:
    # -------------------------
    # warning: `it` calls without arguments will refer to the first block param in Ruby 3.4; use it() or self.it
    (1..3).map { it } # inside a block without explicit parameters
    (1..3).map { it; _1 } # ...even if numbered parameters are used, too
    def it; end
    (1..3).map { it } # even if a method with name `it` exists in the scope

    # The cases that are not warned:
    # -----------------------------
    it # not inside a block
    (1..3).map { |x| it } # inside a block with named parameters
    (1..3).map { || it } # ...even if they are empty
    (1..3).map { it() } # with parentheses
    (1..3).map { it {} } # with a block attached
    (1..3).map { it = 5; it } # if a local variable with the same name is created in the block
    it = 5
    (1..3).map { it } # if a local variable with the same name is in the scope
- Notes: The new feature isn’t expected to conflict with RSpec’s `it`, as calling that without any block attached, or at least a description for the future example, is useless.
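The cases listed above suggest a few spellings that should behave the same in Ruby 3.3 and 3.4. The following is only a minimal sketch: the top-level method named `it` is a hypothetical helper defined purely for illustration, and the claim about 3.4 relies on the warning text quoted above ("use it() or self.it").

```ruby
# Hypothetical helper method named `it`, defined only for this illustration.
def it
  :from_method
end

# Explicit block parameter: never ambiguous, never warned about.
doubled = [1, 2, 3].map { |x| x * 2 }

# Numbered parameter: keeps working as before.
tripled = [1, 2, 3].map { _1 * 3 }

# Parentheses force a method call, so this is not treated as a block argument.
called = [1, 2, 3].map { it() }

p doubled, tripled, called
```

Choosing one of these spellings in code that has to run on both versions avoids depending on how a bare `it` is resolved.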
Anonymous parameters forwarding inside blocks is disallowed
Now anonymous parameters forwarding inside a block raises an error.
- Reason: Blocks didn’t support anonymous parameters forwarding, yet they supported anonymous parameters declaration, which was a confusing situation: something that looked like a block forwarding its own parameters actually forwarded the parameters of the method containing the block.
- Discussion: Feature #19370
- Code:
    def m(*)
      # ..some other code using anonymous params...
      [1, 2, 3].each { |*| p(*) }
    end

    m('test')
    # Ruby 3.2:
    #   The block above looks like it would forward its arguments to p
    #   (so it would print 1, 2, 3); but actually anonymous params of the _method_
    #   are forwarded, so it actually prints:
    #     "test"
    #     "test"
    #     "test"
    # Ruby 3.3:
    #   anonymous rest parameter is also used within block (SyntaxError)
    #   (raised during parsing the file)

    # No error is raised if there's no perceived conflict of anonymous
    # params:
    def m(*)
      # ..some other code using anonymous params...
      [1, 2, 3].each { |i| p(*) } # no question what `*` refers to
    end

    m('test')
    # Ruby 3.3:
    #   "test"
    #   "test"
    #   "test"
- Notes:
  - There is a question whether disallowing block parameters forwarding is the best way to solve the confusion; an alternative solution would be to support forwarding inside the block properly. I hope the discussion will continue during 3.4 development.
  - In the 3.3.0 release, the prohibition was accidentally too greedy, affecting lambdas with unambiguous forwarding, see Bug #20090:

        def b(*)
          -> { c(*) }
          # Unambiguous, yet raises:
          #   anonymous rest parameter is also used within block (SyntaxError)
        end

    This is already fixed, and the fix is to be released in the next minor version.
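One way to sidestep the ambiguity altogether is to name the parameters that are forwarded. The sketch below is only an illustration (both method names are hypothetical); it shows the two intents spelled out explicitly, in a form that should parse on Ruby 3.2 and 3.3 alike.

```ruby
# Hypothetical method names, for illustration only.

# Intent 1: forward the *method's* arguments from inside the block.
def forward_method_args(*args)
  [1, 2, 3].map { |*| args.dup }
end

# Intent 2: use the *block's* own arguments; the method's ones stay anonymous.
def forward_block_args(*)
  [1, 2, 3].map { |*block_args| block_args }
end

p forward_method_args('test')
p forward_block_args('test')
```

With named parameters there is nothing anonymous inside the block, so the question of what `*` refers to never arises.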
Core classes and modules
`Kernel#lambda` raises when passed `Proc` instance
- Reason: `lambda`’s goal is to create a lambda from the provided literal block; in Ruby, it is impossible to change the “lambdiness” of the block once it is created. But `lambda(&proc_instance)` never notified users of that, which was confusing.
- Discussion: Feature #19777
- Documentation: `Kernel#lambda` (no specific details are provided, though)
- Code:
    # Intended usage:
    l = lambda { |a, b| a + b }
    l.lambda? #=> true
    l.parameters #=> [[:req, :a], [:req, :b]]

    # Unintended usage:
    p = proc { |a, b| a + b }

    # In Ruby 3.2 and below, it worked, but the produced value wasn't lambda:
    l = lambda(&p)
    l.parameters #=> [[:opt, :a], [:opt, :b]]
    l.lambda? #=> false
    l.object_id == p.object_id #=> true, it is just the same proc

    # Ruby 3.3:
    l = lambda(&p)
    # in `lambda': the lambda method requires a literal block (ArgumentError)

    # Despite the message about a "literal block," the method
    # works (though has no meaningful effect) with lambda-like Proc objects
    other_lambda = lambda { |a, b| a + b }
    lambda(&other_lambda) #=> works
    lambda(&:to_s) #=> works
    lambda(&method(:puts)) #=> works
- Notes: The discussion once started from the proposal to make `lambda` change the “lambdiness” of a passed block, but that raises multiple issues (changing the block semantics mid-program is just one of them). In general, `lambda` as a method is considered legacy, inferior to the `-> { }` lambda literal syntax, exactly due to problems like this: it looks like a regular method that receives a block, and therefore should be able to accept any block, but in fact it is a “special” method. So in 3.0, there was a warning about `lambda(&proc_instance)`, and since 3.3, the warning finally turned into an error.
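When strict, lambda-style argument checking is wanted around an existing proc, one possible approach is to wrap the proc in a lambda literal instead of trying to convert it. This is a sketch under that assumption, not a recommendation from the Ruby docs; the variable names are illustrative.

```ruby
plain_proc = proc { |a, b| [a, b] }

# Wrap instead of converting: the literal is a lambda, the proc stays a proc.
strict = ->(a, b) { plain_proc.call(a, b) }

p strict.lambda?      # the wrapper checks its arguments like any lambda
p plain_proc.lambda?  # the original proc is untouched
p strict.call(1, 2)
```

The wrapper has its own explicit parameter list, so it raises `ArgumentError` on a wrong number of arguments, while the wrapped proc keeps its lenient semantics.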
`Proc#dup` and `#clone` call `#initialize_dup` and `#initialize_copy`
- Reason: A fix for a small inconsistency created in 3.2: Since that version, `#dup` and `#clone` on an object of a class inherited from `Proc` rightfully produced an instance of the inherited class. But despite the docs for `Object`’s `#dup` and `#clone` methods claiming that the corresponding copying constructors would be called on object cloning/duplication, it was not true for `Proc`.
- Discussion: Feature #19362
- Documentation: none (adheres to the behavior described for `Object#dup` and `#clone`)
- Code:
    # The examples would work the same way with
    # #dup/#initialize_dup and #clone/#initialize_copy
    class TaggedProc < Proc
      attr_reader :tag

      def initialize(tag)
        super()
        @tag = tag
      end

      def initialize_dup(other)
        @tag = other.tag
        super
      end
    end

    proc = TaggedProc.new('admin') { }
    proc.tag #=> 'admin'
    proc.dup.tag
    # Ruby 3.1:
    #   undefined method `tag' for #<Proc:0x0...> -- #dup didn't preserve the class
    # Ruby 3.2:
    #   => nil -- the class is preserved, yet the duplication didn't go through #initialize_dup
    # Ruby 3.3:
    #   => "admin"
- Notes: Inheriting from core classes is an advanced technique, and most of the time there are simpler ways to achieve the same goals (like wrapper objects containing a `Proc` and additional info).
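The wrapper-object alternative mentioned in the notes could look roughly like this. `TaggedCallable` is a hypothetical name invented for the sketch; it relies only on ordinary `Object#dup` behavior (instance variables are copied), so it does not depend on the `Proc` changes described above.

```ruby
# TaggedCallable is a hypothetical name used only for this sketch.
class TaggedCallable
  attr_reader :tag

  def initialize(tag, &block)
    @tag = tag
    @block = block
  end

  # Delegates the call to the wrapped proc.
  def call(*args, &blk)
    @block.call(*args, &blk)
  end

  # Lets the wrapper be passed with & where a block is expected.
  def to_proc
    @block
  end
end

handler = TaggedCallable.new('admin') { |x| x * 2 }
copy = handler.dup

p copy.tag
p copy.call(2)
p [1, 2].map(&handler)
```

The trade-off is that the wrapper is not a `Proc` itself, so code checking `is_a?(Proc)` would need the result of `to_proc` instead.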
`Module#set_temporary_name`
Allows assigning a string to be rendered as the class/module’s `#name`, without assigning the class/module to a constant.
- Reason: The feature is useful to provide a reasonable representation for dynamically auto-generated classes without assigning them to constants (which pollutes the global namespace and might conflict with existing constants) or redefining `Class#name` (which might break other code and is not always respected in the output).
- Discussion: Feature #19521
- Documentation: `Module#set_temporary_name`
- Code:
    dynamic_class = Class.new do
      def foo; end
    end

    dynamic_class.name #=> nil

    # For dynamic classes, representation of related values is frequently unreadable:
    dynamic_class #=> #<Class:0x0...>
    instance = dynamic_class.new #=> #<#<Class:0x0...>:0x0...>
    instance.method(:foo) #=> #<Method: #<Class:0x0...>#foo() ...>
    dynamic_class::Nested = Module.new
    dynamic_class::Nested #=> #<Class:0x0...>::Nested

    # After assigning the temporary name, representation becomes more convenient:
    dynamic_class.set_temporary_name("MyDSLClass(with description)")
    dynamic_class #=> MyDSLClass(with description)
    instance #=> #<MyDSLClass(with description):0x0...>
    instance.method(:foo) #=> #<Method: MyDSLClass(with description)#foo() ...>

    # Note that module constant names are assigned at the moment of their creation,
    # and don't change when the temporary name is assigned:
    dynamic_class::OtherNested = Module.new
    dynamic_class::Nested #=> #<Class:0x0...>::Nested
    dynamic_class::OtherNested #=> MyDSLClass(with description)::OtherNested

    # Assigning names that correspond to constant name rules is prohibited:
    dynamic_class.set_temporary_name("MyClass")
    # `set_temporary_name': the temporary name must not be a constant path to avoid confusion (ArgumentError)
    dynamic_class.set_temporary_name("MyClass::NestedName")
    # `set_temporary_name': the temporary name must not be a constant path to avoid confusion (ArgumentError)

    # When the module with a temporary name is put into a constant,
    # it receives a permanent name, which can't be changed anymore
    C = dynamic_class

    # It affects all associated values (including modules)
    dynamic_class #=> C
    instance #=> #<C:0x0...>
    instance.method(:foo) #=> #<Method: C#foo() ...>
    dynamic_class::Nested #=> C::Nested
    dynamic_class::OtherNested #=> C::OtherNested

    dynamic_class.set_temporary_name("Can I have it back?")
    # `set_temporary_name': can't change permanent name (RuntimeError)

    # `nil` can be used to cleanup a temporary name:
    other_class = Class.new
    other_class.set_temporary_name("another one")
    other_class #=> another one
    other_class.set_temporary_name(nil)
    other_class #=> #<Class:0x0...>
- Notes: Any phrase that is used as a temporary name would be used verbatim; this might create very confusing `#inspect` results and error messages; so it is advised to use strings somehow implying that the name belongs to a module. Imagine we wrap RSpec-style examples into classes with temporary names, and then there is a typo in the body of such an example:

    it "works as a calculator" do
      expec(2+2).to eq 4
    end
    # If we assign just the example description as a temp.name, the
    # error would look like this:
    #
    #  undefined method `expec' for an instance of works as a calculator
    #                                               ^^^^^^^^^^^^^^^^^^^^^
    #
    # ...which is confusing. So it is probably better to construct a
    # module-like temporary name, to have:
    #
    #  undefined method `expec' for an instance of MyFramework::Example("works as a calculator")
    #                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
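The advice about module-like temporary names can be sketched as a small helper. The helper name and the `MyFramework::Example(...)` format are illustrative (the latter is borrowed from the note above); the `respond_to?` guard is an assumption added so that the sketch also runs, as a no-op, on Ruby versions older than 3.3.

```ruby
# Hypothetical helper: builds a module-like temporary name for a generated class.
def name_example_class(klass, description)
  # Module#set_temporary_name exists only in Ruby 3.3+, so guard the call.
  return klass unless klass.respond_to?(:set_temporary_name)

  # The description is embedded via #inspect, so it is visibly quoted.
  klass.set_temporary_name("MyFramework::Example(#{description.inspect})")
end

example_class = name_example_class(Class.new, "works as a calculator")
p example_class
```

Because the whole string is not a valid constant path, it should pass the check shown in the code above, while still reading like something that belongs to a module.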
`Refinement#refined_class` is renamed to `Refinement#target`
Just a renaming of the unfortunately named new method that emerged in Ruby 3.2.
- Discussion: Feature #19714
- Documentation: `Refinement#target`
Strings and regexps
`String#bytesplice`: new arguments to select a portion of the replacement string
The low-level string manipulation method now allows providing the coordinates of the part of the replacement string to be used.
- Reason: The new “byte-oriented” methods were introduced in Ruby 3.2 to support low-level programming like text editors or network protocol implementations. In those use cases, copying a small part of one string into the middle of another is a frequent need, and producing intermediate strings (by first slicing the necessary part) is costly.
- Discussion: Feature #19314
- Documentation: `String#bytesplice`
- Code:
    # Base usage
    buf1 = "Слава Україні!"
    #             ^^^^^^^ - bytes 11-24
    buf2 = "Шана Героям"
    #            ^^^^^^ - bytes 9-20
    buf1.bytesplice(11..24, buf2, 9..20) #=> "Слава Героям!"
    buf1 #=> "Слава Героям!" -- The receiver is modified

    # Or, alternatively, with (start, length) pairs
    buf1 = "Слава Україні!"
    buf1.bytesplice(11, 14, buf2, 9, 12) #=> "Слава Героям!"

    # Two forms can't be mixed:
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, 9, 12)
    # `bytesplice': wrong number of arguments (given 4, expected 2, 3, or 5) (ArgumentError)

    # Index can't be in the middle of the Unicode character:
    buf1.bytesplice(11..23, buf2, 9..20)
    #                    ^
    # `bytesplice': offset 24 does not land on character boundary (IndexError)
    buf1.bytesplice(11..24, buf2, 9..19)
    #                                 ^
    # `bytesplice': offset 20 does not land on character boundary (IndexError)

    # Semi-open ranges work:
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, 9..) #=> "Слава Героям!"
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, ...8) #=> "Слава Шана!"

    # Empty ranges lead to inserting empty strings:
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, 9...8) #=> "Слава !"
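The byte offsets in the example above do not have to be counted by hand. A minimal sketch, assuming UTF-8 strings: the offset of a substring is the byte size of everything before it. The variable names are illustrative, and the `byteslice`-based line is only a portable equivalent for comparison (it allocates the intermediate strings that the new arguments are meant to avoid).

```ruby
buf1 = +"Слава Україні!"
buf2 = "Шана Героям"

# Byte offset of a substring = byte size of everything before it.
target_start = "Слава ".bytesize
target_len   = "Україні".bytesize
source_start = "Шана ".bytesize
source_len   = "Героям".bytesize

# Portable equivalent built from intermediate slices:
portable = buf1.byteslice(0, target_start) +
           buf2.byteslice(source_start, source_len) +
           buf1.byteslice(target_start + target_len, buf1.bytesize)

# The five-argument form exists only in Ruby 3.3+, hence the version guard.
if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.3")
  buf1.bytesplice(target_start, target_len, buf2, source_start, source_len)
end

puts portable
```

On Ruby 3.3 and later, `buf1` ends up with the same content as `portable`, modified in place and without the temporary slices.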
`MatchData#named_captures`: `symbolize_names:` argument
- Discussion: Feature #19591
- Documentation: `MatchData#named_captures`
- Code:
    m = "2023-12-25".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)
    m.named_captures #=> {"year"=>"2023", "month"=>"12", "day"=>"25"}
    m.named_captures(symbolize_names: true) #=> {:year=>"2023", :month=>"12", :day=>"25"}
- Notes: While `symbolize_names:` might look somewhat strange (usually we talk about hash keys), it is done for consistency with Ruby standard library’s `JSON.parse` signature, which inherited the terminology from the JSON specification.
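For code that still has to run on older Ruby versions, the same result can be obtained with `Hash#transform_keys`. This is a sketch of an equivalent, not what the new argument does internally.

```ruby
m = "2023-12-25".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)

# Version-independent equivalent of named_captures(symbolize_names: true)
symbolized = m.named_captures.transform_keys(&:to_sym)

p symbolized
```

Symbol keys make the result convenient for pattern matching or for passing as keyword arguments with `**`.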
`Time.new` with string argument became stricter
The method now requires a fully-specified date-time string.
- Discussion: Bug #19293
- Documentation: `Time.new`
- Code:
    Time.new('2023-12-20')
    # Ruby 3.2: #=> 2023-12-20 00:00:00 +0200
    # Ruby 3.3: in `initialize': no time information (ArgumentError)

    Time.new('2023-12')
    # Ruby 3.2: #=> 2023-12-01 00:00:00 +0200
    # Ruby 3.3: in `initialize': no time information (ArgumentError)

    # A single year still works:
    Time.new('2023') #=> 2023-01-01 00:00:00 +0200
    # ...because it is documented behavior of Time.new to accept
    # strings that are numeric and treat them as numbers:
    Time.new('2023', '12', '20') #=> 2023-12-20 00:00:00 +0200
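If the input really is a date without time, one option is to say so explicitly instead of relying on `Time.new` to guess. A minimal sketch using the `date` standard library; the variable names are illustrative.

```ruby
require 'date'

# Numeric components: accepted by Time.new regardless of the stricter parsing.
from_numbers = Time.new(2023, 12, 20)

# A date-only string: parse it as a date first, then convert to local midnight.
from_date = Date.iso8601('2023-12-20').to_time

p from_numbers == from_date
```

Both spellings make the "midnight, local time zone" assumption visible in the code instead of leaving it to string parsing.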
`Array#pack` and `String#unpack`: raise `ArgumentError` for unknown directives
- Discussion: Bug #19150
- Documentation: `doc/packed_data.rdoc`
- Code:
    [1, 2, 3].pack('r*')
    # Ruby 3.1:
    #   => "", no warning
    # Ruby 3.2:
    #   => "", warning: unknown pack directive 'r' in 'r*'
    # Ruby 3.3:
    #   in `pack': unknown pack directive 'r' in 'r*' (ArgumentError)

    "\x01\x02\x03".unpack("r*")
    # Ruby 3.1:
    #   => [], no warning
    # Ruby 3.2:
    #   => [], warning: unknown unpack directive 'r' in 'r*'
    # Ruby 3.3:
    #   in `unpack': unknown pack directive 'r' in 'r*' (ArgumentError)
Enumerables and collections
`Set#merge` accepts multiple arguments
- Documentation: `Set#merge`
- Code:

    Set[1, 2, 3].merge(Set[3, 4, 5], Set[:a, :b, :c])
    #=> #<Set: {1, 2, 3, 4, 5, :a, :b, :c}>
- Notes: The method’s signature (seen in docs) has a rare clause `**nil`. It means “don’t accept something that looks like keyword arguments.” As `#merge` accepts any list of enumerables, this protects from accidentally passing a hash believing it would be keyword arguments with some meaning:

    Set[1, 2, 3].merge(Set[3, 4, 5], reorder: false)
    #                                ^^^^^^^^^^^^^^
    # Without **nil, this would be treated implicitly as Hash, while looking like keyword arguments
    # But actually, it produces
    #   no keywords accepted (ArgumentError)

    # When you do mean to merge data from hash, use braces to make it explicit
    # (its #each would be used to produce set items):
    Set[1, 2, 3].merge(Set[3, 4, 5], {some: 'data'})
    #=> #<Set: {1, 2, 3, 4, 5, [:some, "data"]}>
`ObjectSpace::WeakKeyMap`
A new “weak map” concept implementation. Unlike `ObjectSpace::WeakMap`, it compares keys by equality (`WeakMap` compares by identity), and only references to keys are weak (garbage-collectible).
- Reason: The idea of a new class grew out of increased usage of `ObjectSpace::WeakMap` (which was once considered internal). In many other languages, the concept of a “weak map” implies that only key references are weak: this allows using it as a generic “holder of some additional information related to a set of objects while they are alive,” or just a weak set of objects (using them as keys and `true` as values): caches, deduplication sets, etc.
- Discussion: Feature #18498
- Documentation: `ObjectSpace::WeakKeyMap`
- Code:
    map = ObjectSpace::WeakKeyMap.new

    key = "foo"
    map[key] = true
    map["foo"] #=> true -- compares by equality, even if two strings are different objects

    # "Just return the equal key" API, always returns the key's object
    map.getkey("foo") #=> "foo"
    map.getkey("foo").object_id == key.object_id #=> true

    key = nil
    GC.start
    map["foo"] #=> nil -- the key was garbage-collected, so the pair was removed

    # One of the possible usages: a lightweight uniqueness cache for
    # many small objects:
    class Money < Data.define(:amount, :currency)
      def self.new(...)
        value = super(...)
        @cache ||= ObjectSpace::WeakKeyMap.new
        if (existing = @cache.getkey(value))
          existing
        else
          @cache[value] = true
          value # return the new object itself, not the result of the assignment
        end
      end
    end

    m1 = Money.new(10, 'USD')
    m2 = Money.new(10, 'USD')
    m1.object_id #=> 60
    m2.object_id #=> 60
    # Same values, it is the same object, so there wouldn't be a huge memory
    # penalty when thousands of similar values are created.

    # No references to "10 USD" object left
    m1 = nil
    m2 = nil
    GC.start

    m3 = Money.new(10, 'USD')
    m3.object_id #=> 80
    # The unused values got garbage-collected, so the cache wouldn't just grow forever
- Notes: The class interface is significantly leaner than `WeakMap`’s, and doesn’t provide any kind of iteration methods (which are very hard to implement and use correctly with weakly-referenced objects), so the new class is more like a black box with associations than a collection.
`ObjectSpace::WeakMap#delete`
- Reason: `WeakMap` is frequently used to have a loose list of objects that will need some processing at some point of program execution if they are still alive/used (that’s why `WeakMap` and not `Array`/`Hash` is chosen in those cases). But it is possible that the code author wants to process objects conditionally, and to remove those which don’t need processing anymore, even if they are still alive. `WeakMap` quacks like a kind of simple `Hash`, yet previously provided no way to delete keys.
- Discussion: Feature #19561
- Documentation: `ObjectSpace::WeakMap#delete`
- Code:
    files_to_close = ObjectSpace::WeakMap.new
    file1 = File.new('README.md')
    file2 = File.new('NEWS.md')
    files_to_close[file1] = true
    files_to_close[file2] = true
    files_to_close.delete(file1) #=> true
    # Attempt to delete non-existing key:
    files_to_close.delete(file1) #=> nil
    # An optional block can be provided in case the key doesn't exist:
    files_to_close.delete(file1) { puts "Already removed"; 0 }
    # Prints "Already removed", returns `0`

    # The block wouldn't be called if the deletion was effectful:
    files_to_close.delete(file2) { puts "Already removed"; 0 }
    # Prints nothing, returns true
`Thread::Queue#freeze` and `SizedQueue#freeze` raise `TypeError`
- Reason: The discussion was started with a bug report about `Queue` not respecting `#freeze` in any way (`#push` and `#pop` were still working after the `#freeze` call). It was then decided that allowing a queue to be frozen like any other collection (leaving it immutable) would have questionable semantics. As `Queue` is meant to be an inter-thread communication utility, freezing a queue while some thread waits for it would either leave this thread hanging, or would require `#freeze`’s functionality to be extended to communicate with dependent threads. Neither is a good option, so the behavior of the method was changed to communicate that queue freezing doesn’t make sense.
- Discussion: Bug #17146
- Documentation: `Thread::Queue#freeze` and `Thread::SizedQueue#freeze`
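If the intent behind freezing a queue was "no more items will be added," the existing `Queue#close` method may be a better fit, since it does communicate with waiting threads. This is a sketch of that alternative (an assumption about the use case, not something stated in the discussion above).

```ruby
queue = Thread::Queue.new

consumer = Thread.new do
  items = []
  # pop returns nil once the queue is closed and drained, ending the loop.
  while (item = queue.pop)
    items << item
  end
  items
end

queue << 1 << 2
queue.close # signals "no more items" to waiting threads

received = consumer.value
p received
```

After `close`, pushing raises `ClosedQueueError`, while consumers can still drain whatever was already queued.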
Range
`#reverse_each`
A specialized `Range#reverse_each` method is implemented.
- Reason: Previously, `Range` didn’t have a specialized `#reverse_each` method, so calling it invoked the generic `Enumerable#reverse_each`. The latter works by converting the object to an array, and then enumerating this array. In the case of a `Range` this can be inefficient (producing large arrays) or impossible (when only the upper bound of the range is defined). It also went into an infinite loop with endless ranges, trying to enumerate the whole range to convert it into an array, while the range can say beforehand that it would be impossible.
- Discussion: Feature #18515
- Documentation: `Range#reverse_each`
- Code:
    # Efficient implementation for integers:
    (1..2**100).reverse_each.take(3)
    # Ruby 3.2: hangs on my machine, trying to produce an array
    # Ruby 3.3:
    #=> [1267650600228229401496703205376, 1267650600228229401496703205375, 1267650600228229401496703205374]
    # (returns immediately)

    (..5).reverse_each.take(3)
    # Ruby 3.2: can't iterate from NilClass (TypeError)
    # Ruby 3.3: #=> [5, 4, 3]

    # Explicit error for endless ranges:
    (1...).reverse_each
    # Ruby 3.2: hangs forever, trying to produce an array
    # Ruby 3.3: `reverse_each': can't iterate from NilClass (TypeError)

    # The latter change affects any type of range beginning:
    ('a'...).reverse_each
    # Ruby 3.2: hangs forever, trying to produce an array
    # Ruby 3.3: `reverse_each': can't iterate from NilClass (TypeError)
- Notes: Other than raising `TypeError` for endless ranges (which works with any type of range beginning), the specialized behavior is only implemented for `Integer`. A possibility of a generalization was discussed by using the object’s `#pred` method (opposite to `#succ`, which the range uses to iterate forward), but the scope of this change would be bigger, as currently only `Integer` implements such a method. It is possible that the adjustments would be made in future versions.
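The idea behind the specialized method can be approximated in plain Ruby: start from the upper bound and step down, never building an array. `reverse_integers` is a hypothetical helper written for this sketch only; it handles `Integer` ranges and is not how the core method is implemented.

```ruby
# Hypothetical helper approximating the idea for Integer ranges only.
def reverse_integers(range)
  # Without an upper bound there is nothing to start from.
  raise TypeError, "can't iterate from NilClass" if range.end.nil?

  last = range.exclude_end? ? range.end - 1 : range.end
  Enumerator.new do |yielder|
    current = last
    # A beginless range has no lower bound, so iteration never stops on its own.
    while range.begin.nil? || current >= range.begin
      yielder << current
      current -= 1
    end
  end
end

p reverse_integers(1..2**100).first(3)
p reverse_integers((..5)).first(3)
```

Because elements are produced lazily, taking the first few works immediately even for huge or beginless ranges.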
`#overlap?`
Checks whether two ranges overlap.
- Discussion: Feature #19839
- Documentation: `Range#overlap?`
- Code:
    (1..3).overlap?(2..5) #=> true
    (1..3).overlap?(4..5) #=> false
    (..3).overlap?(3..) #=> true
    (1...3).overlap?(3..5) #=> false, the first range doesn't include 3
    (1..3).overlap?(3...3) #=> false, the second range is empty (note it has an exclusive end)
    (1..3).overlap?('a'..'c') #=> false, ranges are incompatible (but not an exception)
    (1..3).overlap?(1)
    # `overlap?': wrong argument type Integer (expected Range) (TypeError)
- Notes: As the documentation points out, the technically empty `(...-Float::INFINITY)` range (nothing can be lower than `-Float::INFINITY`, and it is not included) is still considered overlapping with itself by this method:

    (...-Float::INFINITY).overlap?(...-Float::INFINITY) #=> true
    # Same with other "nothing could be smaller" ranges:
    (..."").overlap?(..."") #=> true

  (Though, with Ruby’s dynamic nature, one technically can define an object that will report itself to be smaller than an empty string, and therefore belong to a range… Making it non-empty.)
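For the simplest case the check boils down to comparing bounds crosswise. The helper below is a deliberately narrow sketch with a hypothetical name: it assumes bounded, inclusive, non-empty ranges of comparable values, and does not attempt to reproduce the handling of exclusive ends, open bounds, or empty ranges shown above.

```ruby
# Hypothetical helper; handles only bounded, inclusive, non-empty ranges.
def closed_ranges_overlap?(a, b)
  # Each range must start no later than the other one ends.
  a.begin <= b.end && b.begin <= a.end
end

p closed_ranges_overlap?(1..3, 2..5)
p closed_ranges_overlap?(1..3, 4..5)
```

On Ruby 3.3 and later, `Range#overlap?` is the better choice, since it covers the edge cases this sketch leaves out.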
Filesystem and IO
`Dir.for_fd` and `Dir.fchdir`
Two methods that accept an integer file descriptor as an argument: `for_fd` creates a `Dir` object from it; `fchdir` changes the current directory to the one specified by a descriptor.
- Reason: The new methods allow using UNIX file descriptors if they are returned from C-level code or obtained from the OS.
- Discussion: Feature #19347
- Documentation: `Dir.for_fd`, `Dir.fchdir`
- Code:
    fileno = Dir.new('doc/').fileno # In reality, this #fileno might come from other library

    dir = Dir.for_fd(fileno)
    #=> #<Dir:0x00007f8831b810a8> -- no readable path representation
    dir.path #=> nil
    dir.to_a
    #=> ["forwardable.rd.ja", "packed_data.rdoc", "marshal.rdoc", "format_specifications.rdoc", ....
    # It was performed in the Ruby's core folder, and lists the doc/ contents

    # Attempt to use a bogus fileno will result in error:
    Dir.for_fd(0)
    # `for_fd': Not a directory - fdopendir (Errno::ENOTDIR)

    # Same with fileno that doesn't designate a directory:
    Dir.for_fd(Dir.new('README.md').fileno)
    # in `initialize': Not a directory @ dir_initialize - README.md (Errno::ENOTDIR)

    # Same logic works for .fchdir
    Dir.fchdir(fileno) #=> 0
    Dir.pwd #=> "/home/zverok/projects/ruby/doc" -- the current path has changed successfully

    # A block form of fchdir is available, like for a regular .chdir:
    Dir.fchdir(Dir.new('NEWS').fileno) do |*args|
      p args #=> [] -- no arguments are passed into the block
      p Dir.pwd #=> "/home/zverok/projects/ruby/doc/NEWS"
      'return value'
    end #=> "return value"
    Dir.pwd #=> "/home/zverok/projects/ruby/doc" -- back to the path before the block
- Notes:
  - The functionality is only supported on POSIX platforms;
  - The initial ticket only proposed to find a way to change the current directory to one specified by a descriptor (i.e., what eventually became `.fchdir`), but during the discussion a need was discovered for a generic instantiation of a `Dir` instance from the descriptor (what became `for_fd`), as well as a generic way to change the current directory to one specified by a `Dir` instance (`#chdir`, which is not related to descriptors but is generically useful).
`Dir#chdir`
An instance method version of `Dir.chdir`: changes the current working directory to the one specified by the `Dir` instance.
- Discussion: Feature #19347
- Documentation: `Dir#chdir`
- Code:
    Dir.pwd #=> "/home/zverok/projects/ruby"
    dir = Dir.new('doc')
    dir.chdir #=> nil
    Dir.pwd #=> "/home/zverok/projects/ruby/doc"

    # The block form works, too:
    Dir.new('NEWS').chdir do |*args|
      p args #=> [] -- no arguments are passed into the block
      Dir.pwd #=> "/home/zverok/projects/ruby/doc/NEWS"
      'return value'
    end #=> "return value"
    Dir.pwd #=> "/home/zverok/projects/ruby/doc"
Deprecate subprocess creation with methods dedicated to files
- Reason: Methods dedicated to opening/reading a file by name historically supported a special syntax of the argument: if it started with the pipe character `|`, a subprocess was created and could be used to communicate with an external command. The functionality is still explained in Ruby 3.2 docs. It created a security vulnerability, though: even when the program’s author didn’t rely on that behavior, a malicious string could be passed by an attacker instead of an innocent filename.
- Discussion: Feature #19630
- Affected methods:
  - `Kernel#open`
  - `IO.binread`
  - `IO.foreach`
  - `IO.readlines`
  - `IO.read`
  - `IO.write`
  - `URI.open` (open-uri standard library)
- Code:
    IO.read('| ls')
    #=> contents of the current folder

    Warning[:deprecated] = true # Or pass -w command-line option
    IO.read('| ls')
    # warning: Calling Kernel#open with a leading '|' is deprecated and will be removed in Ruby 4.0; use IO.popen instead
    #=> contents of the current folder
- Notes:
  - The documentation for the corresponding methods was adjusted accordingly. Compare the documentation for `Kernel#open` from 3.2 (explains and showcases the `|` trick) and 3.3 (just mentions that there is a vulnerability to command injection attack).
  - As advised by the warning, `IO.popen` is a specialized method for when communicating with an external process is the desired functionality:

        IO.popen('ls', &:read) #=> contents of the current folder

  - As the impact of the change might be big, note that the target version for removal is set to 4.0. To the best of my knowledge, there is no set date for the major version yet.
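A minimal sketch of keeping the two intents apart: an explicit subprocess on one side, explicit file access on the other. `read_file_only` is a hypothetical helper, and `RbConfig.ruby` (the path of the running interpreter) merely stands in for a real external command. The sketch relies on the documented behavior that `File.read`, unlike `IO.read`, does not interpret a leading `|`.

```ruby
require 'rbconfig'

# Explicit subprocess: passing an argument array avoids going through a shell.
output = IO.popen([RbConfig.ruby, '-e', 'print 1 + 1'], &:read)

# Explicit file access: File.read treats the whole string as a path.
# (Hypothetical helper, for illustration only.)
def read_file_only(name)
  File.read(name)
rescue Errno::ENOENT
  nil
end

# Looks like a command, but is only ever treated as a (missing) file name.
suspicious = read_file_only('| echo pwned')

p output
p suspicious
```

Keeping user-supplied names on the `File` side means a string starting with `|` can at worst point at a missing file.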
`NoMethodError`: change of rendering logic
`NoMethodError` doesn’t use the target object’s `#inspect` in its message, and renders “instance of ClassName” instead.
- Reason: While the `#inspect` of the object which failed to respond might be convenient in the error’s output, it also might be extremely inefficient and confusing when the object is large and doesn’t have `#inspect` redefined to something sensible. It is impossible to require all user objects to redefine `#inspect`, and even if it is redefined, it might be short yet inefficient; so the lesser of evils was chosen, and the exception’s message became more efficient even if less informative.
- Documentation: `NoMethodError`
- Code:
    "hello".to_ary
    # Ruby 3.2: undefined method `to_ary' for "hello":String (NoMethodError)
    # Ruby 3.3: undefined method `to_ary' for an instance of String (NoMethodError)

    # But also, for some complicated data structure:
    ([{name: 'User 1', role: 'admin'}] * 100).to_josn # typo
    # Ruby 3.2: undefined method `to_josn' for [{:name=>"User 1", :role=>"admin"}, {:name=>"User 1", :role=>"admin"}, ...
    #  ....10 lines of console output....
    #  ..., {:name=>"User 1", :role=>"admin"}]:Array (NoMethodError)
    #
    # Ruby 3.3: undefined method `to_josn' for an instance of Array (NoMethodError)
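When details about the receiver are still needed (in logs, for example), they can be taken from the exception object rather than from its message. A minimal sketch relying on `NameError#name` and `NameError#receiver`; the shape of the summary string is just an illustration.

```ruby
data = [{name: 'User 1', role: 'admin'}] * 100

begin
  data.to_josn # typo, same as in the example above
rescue NoMethodError => e
  # #name and #receiver are available regardless of the message format.
  summary = "#{e.name} called on #{e.receiver.class} with #{e.receiver.size} items"
end

puts summary
```

This keeps the decision about how much of the object to print with the code that handles the error.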
`Fiber#kill`
Terminates the Fiber by sending an exception inside it.
- Reason: The method is intended to be used with fibers that represent processes that need to be told explicitly to finalize themselves (invoking any `ensure` operations and cleanups that are necessary). If such a fiber is just abandoned and collected by the GC, the fiber’s `ensure` wouldn’t be invoked, and therefore the resources wouldn’t be cleaned; so there was a need for a way to do this explicitly.
- Discussion: Bug #595
- Documentation: `Fiber#kill`
- Code:
    f = Fiber.new do
      (1..).each { Fiber.yield _1 }
    ensure
      puts "Closing myself"
    end
    #=> #<Fiber:0x0... (created)>
    f.resume #=> 1
    f.resume #=> 2
    f #=> #<Fiber:0x0... (suspended)>
    f.kill # Prints: "Closing myself"
    f #=> #<Fiber:0x0... (terminated)>
    f.resume
    # `resume': attempt to resume a terminated fiber (FiberError)

    # Semi-realistic usage example:
    reader = Fiber.new do
      conn = SomeConnection.open(**params)
      while conn.open?
        Fiber.yield conn.read
      end
    ensure
      conn.close
    end

    headers = reader.resume # reads something from the connection
    body_line1 = reader.resume # reads some more
    # Now, if we want to explicitly stop reading and be sure that the connection
    # is closed, we might do this:
    reader.kill # invokes #ensure
- Notes:
  - The exception sent to the Fiber is uncatchable (so no `rescue Exception` will notice it), so it can’t be said that it has some class; the only usage of the fact that it is raised through the exception mechanism is invoking the `ensure` block;
  - The fiber that invoked the killed one with `resume` or `transfer` receives `nil` from that call;

        f1 = Fiber.new {
          # Instead of yielding something back, the fiber kills itself
          Fiber.current.kill
        }
        f2 = Fiber.new {
          result = f1.transfer
          p(result:)
        }
        f2.resume
        # prints: {:result => nil}

  - Only fibers belonging to the same thread can be killed.
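For code that must also run on Ruby versions without `Fiber#kill`, a rough approximation is to raise an ordinary exception inside the fiber with `Fiber#raise` and rescue it on the outside. The sketch below is an assumption-laden fallback, not an equivalent: unlike `kill`, the exception is catchable inside the fiber. `StopFiber` is a hypothetical error class defined for the illustration.

```ruby
# Hypothetical error class used only by the fallback branch.
class StopFiber < StandardError; end

log = []

fiber = Fiber.new do
  i = 0
  while true
    Fiber.yield(i += 1)
  end
ensure
  log << :closed
end

fiber.resume
fiber.resume

if fiber.respond_to?(:kill)
  fiber.kill # Ruby 3.3+: runs `ensure`, nothing to rescue
else
  begin
    fiber.raise(StopFiber) # older Rubies: raise a regular exception inside
  rescue StopFiber
    # the exception propagates back here once the fiber has unwound
  end
end

p log
p fiber.alive?
```

Either way the `ensure` block runs exactly once and the fiber ends up terminated.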
Internals
New `Warning` category: `:performance`
A new warning category was introduced for code that is correct but is known to produce performance problems. One such new warning was added for objects with too many “shape” variations.
- Discussion: Feature #19538
- Documentation: `Warning#[category]`
- Code: Here is an example of the new warning in play:
    class C
      def initialize(i)
        instance_variable_set("@var_#{i}", i**2)
      end
    end

    Warning[:performance] = true # or pass `-W:performance` command-line argument

    (1..10).map { C.new(_1) }
    # warning: Maximum shapes variations (8) reached by C, instance variables accesses will be slower.
  The example is artificial, but it shows the principle: when we have more than 8 instances of the same class, but with different lists of instance variables (shapes), we might have a performance problem. This means, for example, that a frequently-used class that has many methods with a memoization idiom (`@var ||= value` on the first access) would create the same problem, unless all of the variables are initialized in `initialize`, making all instances have the same shape:

    class C
      # 9 different getters that create an instance variable
      # on the first access.
      def var1 = @var1 ||= rand
      def var2 = @var2 ||= rand
      def var3 = @var3 ||= rand
      def var4 = @var4 ||= rand
      def var5 = @var5 ||= rand
      def var6 = @var6 ||= rand
      def var7 = @var7 ||= rand
      def var8 = @var8 ||= rand
      def var9 = @var9 ||= rand
    end

    Warning[:performance] = true
    # Invoking different getters on different instances of the same class makes
    # them have different set of instance variables.
    (1..9).map { C.new.send("var#{_1}") }
    # warning: Maximum shapes variations (8) reached by C, instance variables accesses will be slower.

    # But if we add this to initialize:
    class C
      def initialize
        @var1, @var2, @var3, @var4, @var5, @var6, @var7, @var8, @var9 = nil
      end
    end

    (1..9).map { C.new.send("var#{_1}") }
    # no warning. All objects have the same list of instance vars = the same shape
- Notes:
  - The warning category should be turned on explicitly by providing the `-W:performance` CLI option or `Warning[:performance] = true` from the program.
- Additional reading: Performance impact of the memoization idiom on modern Ruby by Ruby core team member Jean Boussier.
Process.warmup
A method to call once a long-running application has finished loading, before its regular work starts.
- Discussion: Feature #18885
- Documentation: `Process.warmup`
- Notes: The justification discussion linked above and the method docs explain and showcase this better than could be done here.
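Still, as a rough sketch of where such a call might go: the `load_application` helper below is hypothetical and only stands in for whatever boot work an app does, and the `respond_to?` guard is there just so the snippet also runs on Rubies older than 3.3.

```ruby
# Hypothetical stand-in for an application's boot phase:
# requiring code, reading configuration, building caches, and so on.
def load_application
  @config = { workers: 2 }.freeze
end

load_application

# Call once loading is done and before forking workers or serving requests.
# Process.warmup exists only in Ruby 3.3+, hence the guard.
warmed_up = Process.respond_to?(:warmup) ? Process.warmup : false

puts "warmup performed: #{warmed_up}"
```

What exactly happens during the warmup is implementation-specific, so the linked docs remain the place to check before relying on any particular effect.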
`Process::Status#&` and `#>>` are deprecated
- Reason: These methods treat `Process::Status` as a very thin wrapper around the integer return status of the process, which is unreasonable when supporting Ruby in more varied environments.
- Discussion: Bug #19868
- Documentation: `Process::Status#&`, `#>>`
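A minimal sketch of the more portable style, assuming the existing predicate and getter methods of `Process::Status` are the intended replacement for bit operations on the status (the child command here is just an illustration):

```ruby
require "rbconfig"

# Run a child Ruby process that exits with status 3.
system(RbConfig.ruby, "-e", "exit 3")
status = $? # a Process::Status instance

# Instead of bit operations like `status >> 8` or `status & 0x7f`,
# ask the object what happened:
puts status.exited?     # did the process terminate normally?
puts status.exitstatus  # its exit code
puts status.signaled?   # was it killed by a signal?
puts status.success?    # was the exit code zero?
```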
`TracePoint` supports `:rescue` event
Allows tracing when an exception was `rescue`'d in the code of interest.
- Discussion: Feature #19572
- Documentation: `TracePoint#Events`
- Code:

    TracePoint.trace(:rescue) do |tp|
      puts "Exception rescued: #{tp.raised_exception.inspect} at #{tp.path}:#{tp.lineno}"
    end

    begin
      raise "foo"
    rescue => e
    end
    # Prints: "Exception rescued: #<RuntimeError: foo> at example.rb:7"

- Notes: The event-specific attribute for the event is the same as for `:raise`: `#raised_exception`.
Standard library
Since the Ruby 3.1 release, most of the standard library has been extracted to either default or bundled gems; their development happens in separate repositories, and changelogs are either maintained there or absent altogether. Either way, their changes aren’t mentioned in the combined Ruby changelog, and I’ll not be trying to follow all of them.
The stdgems.org project has a nice explanation of the default and bundled gems concepts, as well as a list of currently gemified libraries and links to their docs.
“For the rest of us” this means that library development is extracted into separate GitHub repositories, and the libraries are just packaged with main Ruby before release. It means you can open an issue or PR for any of them independently, without going through the tougher development process of core Ruby.
A few changes to mention, though:
- `BasicSocket#recv` and `BasicSocket#recv_nonblock` return `nil` instead of an empty string on closed connections. `BasicSocket#recvmsg` and `BasicSocket#recvmsg_nonblock` return `nil` instead of an empty packet on closed connections. Discussion: Bug #19012
- Name resolution such as `Socket.getaddrinfo`, `Socket.getnameinfo`, `Addrinfo.getaddrinfo` can now be interrupted. Discussion: Feature #19965
- `Random::Formatter#alphanumeric`: `chars` keyword argument. Feature #18183:

    require 'random/formatter'

    # The default behavior: uses English alphabet + numbers
    Random.alphanumeric #=> "fhCshEkcGfCTO6Ny"
    # With the argument provided:
    Random.alphanumeric(chars: ['a', 'b', 'c']) #=> "cbacacbababccccc"
    # Note that the argument should be an array.
    # So if you have a string of characters, you can do:
    Random.alphanumeric(chars: 'abc'.chars) #=> "abbaccaacbacbccc"
    # Any object is acceptable as an array element; the method
    # would just use their `#to_s`; arrays would be flattened:
    Random.alphanumeric(chars: [1, true, [2], Object.new])
    #=> "111true11211true2true#<Object:0x00007fe804e79f48>221"
    # An empty array just hangs forever:
    Random.alphanumeric(chars: []) # never returns

- There were many amazing changes in Ruby’s console IRB. See the article by IRB maintainer Stan Lo: Unveiling the big leap in Ruby 3.3’s IRB.
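To illustrate the `BasicSocket#recv` change from the list above, here is a small sketch using a local socket pair (this assumes a Unix-like system where `UNIXSocket` is available). The check is written so that it works both with the new `nil` result and with the empty string returned by earlier versions:

```ruby
require "socket"

# A connected pair of stream sockets (Unix-like systems only).
reader, writer = UNIXSocket.pair
writer.close # the peer closes the connection

data = reader.recv(16)

# Ruby 3.3 is expected to return nil here; earlier versions returned "".
# Checking for both keeps the code working across versions.
if data.nil? || data.empty?
  puts "connection closed"
else
  puts "received: #{data}"
end

reader.close
```

Code that previously tested only `data.empty?` would raise `NoMethodError` on `nil`, so that is the pattern worth looking for when upgrading.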
Version updates
Default gems
- RubyGems 3.5.3
- abbrev 0.1.2
- base64 0.2.0
- benchmark 0.3.0
- bigdecimal 3.1.5
- bundler 2.5.3
- cgi 0.4.1
- csv 3.2.8
- date 3.3.4
- delegate 0.3.1
- drb 2.2.0
- english 0.8.0
- erb 4.0.3
- error_highlight 0.6.0
- etc 1.4.3
- fcntl 1.1.0
- fiddle 1.1.2
- fileutils 1.7.2
- find 0.2.0
- getoptlong 0.2.1
- io-console 0.7.1
- io-nonblock 0.3.0
- io-wait 0.3.1
- ipaddr 1.2.6
- irb 1.11.0 (Releases page, blog post)
- json 2.7.1
- logger 1.6.0
- mutex_m 0.2.0
- net-http 0.4.0
- net-protocol 0.2.2
- nkf 0.1.3
- observer 0.1.2
- open-uri 0.4.1
- open3 0.2.1
- openssl 3.2.0
- optparse 0.4.0
- ostruct 0.6.0
- pathname 0.3.0
- pp 0.5.0
- prettyprint 0.2.0
- pstore 0.1.3
- psych 5.1.2
- rdoc 6.6.2
- readline 0.0.4
- reline 0.4.1
- resolv 0.3.0
- rinda 0.2.0
- securerandom 0.3.1
- set 1.1.0
- shellwords 0.2.0
- singleton 0.2.0
- stringio 3.1.0
- strscan 3.0.7
- syntax_suggest 2.0.0
- syslog 0.1.2
- tempfile 0.2.1
- time 0.3.0
- timeout 0.4.1
- tmpdir 0.2.0
- tsort 0.2.0
- un 0.3.0
- uri 0.13.0
- weakref 0.1.3
- win32ole 1.8.10
- yaml 0.3.0
- zlib 3.1.0
Bundled gems
- minitest 5.20.0
- rake 13.1.0
- test-unit 3.6.1
- rexml 3.2.6
- rss 0.3.0
- net-ftp 0.3.3
- net-imap 0.4.9
- net-smtp 0.4.0
- rbs 3.4.0
- typeprof 0.21.9
- debug 1.9.1
Standard library content changes
New libraries
- prism (nee YARP) is added. It is a new Ruby code parser, developed by Kevin Newton, which intends to become the Ruby parser, shared by all implementations (not only CRuby/MRI, but also TruffleRuby, JRuby, and others) and tools that need to parse Ruby code (like Sorbet or Rubocop). It doesn’t replace CRuby’s Ruby parser, at least for now, but can be used to parse Ruby quickly and produce a robust, easy-to-use AST.
  - Documentation: `Prism` (it isn’t very well rendered in the standard library docs, so the official site is recommended);
  - Note: You can run Ruby with Prism as its main parser with `--parser=prism`, but it is only for experimentation and debugging for now.
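A minimal sketch of using the library from Ruby code, assuming the `Prism.parse` entry point and the `success?`, `errors` and `value` methods of its result as described in the Prism docs (the API was still evolving at the time of the 3.3 release, so details may differ between versions). The `LoadError` guard only exists so the snippet does not fail on older Rubies without the gem:

```ruby
prism_loaded = begin
  require "prism"
  true
rescue LoadError
  false # Ruby older than 3.3 without the prism gem installed
end

if prism_loaded
  result = Prism.parse("1 + 2")

  puts result.success?      # whether the source parsed without errors
  puts result.errors.size   # number of syntax errors found
  puts result.value.class   # the root node of the AST
else
  puts "prism is not available on this Ruby"
end
```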
Removals
- `readline` extension is removed. It was a standard library written in C to wrap GNU Readline, used to implement interactive consoles like IRB. Ruby has included a pure-Ruby replacement called reline since 2.7, and now `require 'readline'` will just require it and make an alias `Readline = Reline`. However, if the readline-ext gem is installed explicitly, `require 'readline'` would require it. Discussion: Feature #19616.

    require 'readline'

    Readline
    # Ruby 3.2:
    # => Readline -- a separate library/constant
    # Ruby 3.3:
    # => Reline -- Readline is just an alias

    Readline.method(:readline)
    # Ruby 3.2:
    # => #<Method: Readline.readline(*)> -- a C-defined method with no location/signature extracted
    # Ruby 3.3:
    # => #<Method: Reline.readline(*args, &block) <...>/lib/ruby/3.3.0+0/forwardable.rb:231>

Default gems that became bundled
This means that if your dependencies are managed by Bundler and your code depends on `racc`, it should be added to the `Gemfile`.
- racc 1.7.3
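For a Bundler-managed project, the change might look like the fragment below. The source URL and the absence of a version constraint are just assumptions for illustration; adjust them to your project's conventions:

```ruby
# Gemfile
source "https://rubygems.org"

# racc is a bundled gem since Ruby 3.3, so Bundler-managed code
# that requires it needs to declare it explicitly.
gem "racc"
```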
Gems that are warned to become bundled in the next version
Starting with the next version of Ruby, these gems won’t be available in a Bundler-managed environment unless explicitly added to the `Gemfile`. For now, requiring them in such an environment produces a warning. Discussion: Feature #19351 (initial proposal to promote many gems, which then was deemed problematic), Feature #19776 (the warning proposal)