Lua introduction for experienced developers
About Lua
Lua, not LUA
This is a Portuguese word meaning "moon", not an acronym.
Presentation
Lua is an interpreted, dynamically typed, multi-paradigm language. It was designed in 1993 with the following objectives in mind:
- Extensible semantics. A small set of features supports multiple paradigms, from functional to object-oriented programming. For example, while inheritance is not natively supported, it can easily be set up through metatables. The same features allow events in Civ5 to behave both as functions and as objects with Add and Remove methods.
- Highly portable. The downside is a lightweight standard library that lacks many common features.
- Easy to embed and to interface from other languages.
- Data should be easy to describe in Lua (à la JSON).
- A beginner-friendly syntax.
Performance
As usual with interpreted and dynamically typed languages, expect code to run 10 to 100 times slower than with compiled and statically typed languages like C, C++ or C#. While interpreted languages can drastically improve their performance through more or less sophisticated means (expression-tree or opcode caching, a JIT compiler, type inference to statically resolve code), no such solution has been used in Civ5 (aside, probably, from the caching). And while an external JIT library exists, it causes compatibility problems with most mods using Lua.
That being said, Lua is still pretty fast for this family of languages, faster than the Python implementation from Civ4 for example. All in all, it is fast enough for most of what modders want to achieve without the need to compromise code elegance, readability and maintainability for performance. So keep in mind that "premature optimization is the root of all evil" (Donald Knuth).
Syntax cheatsheet
Basics
-- Literals
local x = 5
local x = 5.0
local x = nil -- A nil value is the same as an undefined value
local x = true
local x = false
-- Strings (single and double quotes are interchangeable)
local x = "five"
local x = 'five'
local x = "five\n" -- Escape characters with "\".
local x = [[Five is a number.
Did you know?]] -- Multi-line string: equivalent to "Five is a number.\nDid you know?"
-- Globals versus locals
x = 5 -- For globals, assigning is declaring.
local x = 5
Operators
local x = a or b
local x = a and b
local x = "hello ".."world" .. "!" -- concatenation: two dots
local x = "five is "..5 -- "five is 5". On concatenation, implicit casts occur.
local x = (a + b - d) * f ^ g % h -- ^ is exponentiation, % is modulo
local x = a == b -- equality -- True only if a and b have the same values and types.
local x = a ~= b -- inequality
local x = ((a <= b) == not (a > b)) == ((a >= b) == not (a < b))
local x = "abc" < "def" -- Byte-wise comparison, not Unicode-compliant. Use Locale's methods instead.
local x = #"abc" -- Returns 3, the number of bytes in the string (3 characters, 1 byte each).
local x = #"您好世界" -- Returns 12 if the file is UTF-8 encoded: the number of bytes in the string (4 characters, 3 bytes each).
local x = #SomeTable -- Returns the length of the array part: an integer n such that SomeTable[n] is not nil and SomeTable[n + 1] is nil, or 0 for an empty table.
-- No increment operator (++ --), no binary-assign operator (+= *=), no ternary operator (? :), no coalescence operator (??)
Tables and multiple assignments
-- Table declaration
SomeTable = {}
SomeArray = { "abc", "def" }
-- 1 => "abc", 2 => "def"
SomeComplexTable = { name = "John", surname = "Doe", age = 25, 43, [7] = 8 }
-- "name" => "John", "surname" => "Doe", "age" => 25, 1 => 43, 7 => 8.
-- Member access (all equivalent)
SomeTable.SomeMember = 5
SomeTable["SomeMember"] = 5
-- Multiple assignment
-- Both statements are equivalent since SomeArray contains "abc" and "def".
local x, y = "abc", "def"
local x, y = unpack(SomeArray)
Functions
-- Function declarations (all equivalent)
function HelloWorld(a, b) print("Hello world") end
HelloWorld = function(a, b) print("Hello world") end
-- Functions are first-class objects and can be assigned like any other value!
-- Method declarations (all equivalent)
function SomeTable.HelloWorld(a, b) print("Hello world") end
SomeTable.HelloWorld = function(a, b) print("Hello world") end
-- Instanced method declarations (all equivalent)
function SomeTable.HelloWorldInstanced(self, a, b) print(self.SomeMember) end
function SomeTable:HelloWorldInstanced(a, b) print(self.SomeMember) end
-- Function and method calls
HelloWorld(a) -- b will be nil in the function
HelloWorld(a, b)
SomeTable.HelloWorld(a, b)
-- Instanced method calls (all equivalent)
SomeTable:HelloWorldInstanced(a, b)
SomeTable.HelloWorldInstanced(SomeTable, a, b)
-- Multiple return values
function func(a, b) return a, b end
local x, y, z = func(5, 6) -- Assigns 5 to x and 6 to y, z will be nil
-- Variable arguments
function SomeFunc(a, b, ...)
print(a)
print(b)
for i, v in ipairs({...}) do -- {...} collects the variable arguments in a table ("arg" is not a keyword, only a legacy implicit table)
print(v)
end
end
SomeFunc(1, 2, 3, 4, 5) -- prints 1, 2, 3, 4, 5
SomeFunc(1, 2) -- prints 1, 2
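The variable arguments can also be inspected without building a table. The following is a minimal sketch using the standard select function; CountAndSum is a hypothetical name chosen for illustration. Unlike a table built from the arguments, select("#", ...) also counts nil arguments.

```lua
-- CountAndSum is a hypothetical helper, not part of any API.
local function CountAndSum(...)
  local count = select("#", ...) -- number of extra arguments, nil included
  local sum = 0
  for i = 1, count do
    local v = select(i, ...) -- select(i, ...) returns the arguments from the i-th on; we keep the first
    if type(v) == "number" then sum = sum + v end
  end
  return count, sum
end

local count, sum = CountAndSum(1, nil, 3)
print(count, sum) -- prints 3, 4
```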
Control flow
-- If
if name == "Wu Zetian" then
print("Hello China")
elseif name == "Napoleon" then -- "elseif", not "else if". Beware C developers!
print("Hello France")
else -- no "then" after "else"!
print("Hello unknown")
end
-- For loop (with counter)
for i = 1, 8 do print(i) end -- Prints 1, 2, 3, 4, 5, 6, 7, 8
for i = 1, 8, 2 do print(i) end -- Prints 1, 3, 5, 7
-- For loop (with iterator)
for playerID, pPlayer in pairs(Players) do -- Prints 1 Wu Zetian, 2 Napoleon, ...
print(playerID, pPlayer:GetName())
end
for plot in Plots() do -- Prints 0 0, 0 1, 0 2, ...
print(plot:GetX(), plot:GetY())
end
-- While (i will be 5 in the end)
while i < 5 do
i = i + 1
end
-- Repeat until (i will be 5 in the end)
repeat
i = i + 1
until i == 5
-- Infinite loops
-- Lua supports "break" in all loops, but there is no "continue" instruction!
while true do
i = i + 1
if i > 5 then break end
end
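Since there is no "continue", one possible workaround is to wrap the loop body in a single-pass repeat...until true block, so that "break" only leaves the inner block. This is a sketch of one idiom among several; wrapping the rest of the body in an if statement is often simpler.

```lua
-- Collect the odd numbers from 1 to 10, skipping the even ones.
local odds = {}
for i = 1, 10 do
  repeat -- single-pass inner loop: "break" here acts like "continue"
    if i % 2 == 0 then break end -- skip the rest of this iteration
    odds[#odds + 1] = i
  until true
end
print(table.concat(odds, ", ")) -- prints 1, 3, 5, 7, 9
```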
Specificities
Type system
You can get a variable's type by using the type function. It takes any value and returns a string. Here are the possible results:
- nil: Lua makes no distinction between an undefined variable and a variable with a nil value. Assigning nil to a variable or table member is equivalent to undefining it.
- number: a 64-bit floating-point number. Lua has no integer type and only uses floating-point numbers. See also "No integer type" below.
- string: your usual string of characters. In Lua all strings are interned. This speeds up comparisons but it makes allocation of new strings slower.
- boolean: true or false obviously.
- table: any table.
- userdata: objects defined in C for use in Lua. They come with limitations; for example, they cannot be enumerated through pairs. That said, the Civ5 API tends to return regular tables whose metatable is of type userdata, which allows us to use the provided objects as regular Lua tables.
- function: remember that functions are first-class objects in Lua.
- thread: coroutines actually. Lua does not support threads.
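As a quick illustration of the list above, here is a small sketch that calls type on a few values. The sample values are arbitrary; undefinedVariable is a deliberately undeclared global.

```lua
local samples = { 5, "five", true, {}, print }
local names = {}
for i, v in ipairs(samples) do
  names[i] = type(v)
end
print(table.concat(names, " "))                -- prints number string boolean table function
print(type(nil))                               -- prints nil
print(type(undefinedVariable))                 -- prints nil: an undefined global reads as nil
print(type(coroutine.create(function() end)))  -- prints thread
```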
Strict equality comparisons
Two values are equal only if their types are the same.
This is stricter than C and far stricter than PHP for example.
- 4 is not equal to "4". (number type and string type). In PHP they would be equal.
- 0 is not equal to false (number type and bool type). In C they would be equal.
- 0 is not equal to nil. (number type and nil type). In C they would be equal.
Regarding tables and C-made objects, they are compared by reference.
a = { 5 }
b = { 5 }
c = a
assert(a == c) -- OK
assert(a == b) -- ERROR
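Because values of different types never compare equal, a value read as text has to be converted explicitly before the comparison. A minimal sketch with the standard tonumber and tostring functions:

```lua
assert(4 ~= "4")                -- different types: never equal
assert(tonumber("4") == 4)      -- convert the string to a number first
assert(tostring(4) == "4")      -- or the number to a string
assert(tonumber("four") == nil) -- tonumber returns nil when the string is not a number
```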
Boolean logic
Anything that is neither false nor nil is evaluated to true.
This is pretty simple. In particular, an empty string and the number zero both evaluate to true.
if 0 then print("hello world") end -- prints hello world
Now regarding the and/or operators, two rules to remember:
- The right operand is only evaluated when needed.
  - false and x() does not call x() since the result will be false anyway.
  - true or x() does not call x() since the result will be true anyway.
- The result is the last operand to have been evaluated.
  - x() and y() returns x() if it was false or nil, y() otherwise.
  - x() or y() returns x() if it was neither false nor nil, y() otherwise.
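The two rules can be checked with a small sketch. Touch is a hypothetical helper that counts how many times it is called and returns its argument unchanged.

```lua
local calls = 0
local function Touch(value) -- hypothetical helper: counts calls, returns its argument
  calls = calls + 1
  return value
end

local r1 = false and Touch(1)          -- Touch is not called
local r2 = true or Touch(2)            -- Touch is not called
assert(calls == 0)
local r3 = nil or Touch("fallback")    -- Touch is called and its result is returned
local r4 = Touch(0) and "zero is true" -- 0 evaluates to true, so the right operand is returned
print(r1, r2, r3, r4, calls) -- prints false, true, fallback, zero is true, 2
```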
Ternary and coalescence statements
We can exploit the and/or operators behavior:
- Coalescence: x or y. It returns x if x is neither nil nor false, y otherwise.
  - In other words, result = x or y is equivalent to: if x then result = x else result = y end.
- Ternary: condition and x or y (as long as x is neither nil nor false). It returns x if condition was true, y if it was false.
  - Indeed, the expression is parsed as (condition and x) or y. If condition is true, the and operator returns x, and since x is true, y is not evaluated and or returns x.
  - In other words, result = condition and x or y is equivalent to: if condition then result = x else result = y end.
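The restriction on x matters in practice. The sketch below shows the idiom failing when x is false, along with Pick, a hypothetical helper function that behaves like a true ternary at the cost of always evaluating both x and y.

```lua
local function Pick(condition, x, y) -- hypothetical helper: a safe ternary
  if condition then return x else return y end
end

local a = true and "yes" or "no"    -- "yes"
local b = false and "yes" or "no"   -- "no"
local c = true and false or "oops"  -- "oops": x is false, so the idiom wrongly picks y
local d = Pick(true, false, "oops") -- false: the helper handles a false x
print(a, b, c, d) -- prints yes, no, oops, false
```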
Arrays versus tables
In Lua, an array is a table whose keys are consecutive integers, typically starting at 1.
Since this is a common programming pattern, a number of things are designed for arrays:
- {"abc", "def", "ghi"} starts at index 1, up to index 3. Arrays typically start at 1.
- table.insert({}, "abc") will transform the provided empty table to {"abc"}.
- table.insert({"abc", "def"}, "ghi") will transform the provided table to {"abc", "def", "ghi"}.
- table.insert({"abc", "ghi"}, 2, "def") will transform the provided table to {"abc", "def", "ghi"}.
- table.remove({"abc", "def", "ghi"}, 2) will transform the provided table to {"abc", "ghi"}.
- #{"abc", "def", "ghi"} returns 3, the length of the array.
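Put together on a named table, these functions look as follows. This is a minimal sketch; queue is an arbitrary variable name.

```lua
local queue = {}
table.insert(queue, "abc")             -- { "abc" }
table.insert(queue, "ghi")             -- { "abc", "ghi" }
table.insert(queue, 2, "def")          -- { "abc", "def", "ghi" }
local removed = table.remove(queue, 1) -- returns "abc" and shifts the rest down
print(removed, #queue, queue[1], queue[2]) -- prints abc, 2, def, ghi
local last = table.remove(queue)       -- without a position: removes the last element
print(last, #queue)                    -- prints ghi, 1
```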
Ipairs versus pairs
You could say that pairs is for tables and ipairs for arrays.
- pairs enumerates all the key/value pairs in an unspecified order.
- ipairs only enumerates the key/value pairs for consecutive integer keys, in ascending order, starting from 1. If at some point a key is missing, the enumeration stops.
- It does not matter whether the table was created with table.insert or by directly assigning the values through, for example, table[i] = value.
-- Produces: { 1 => "one", 2 => "two", 4 => "four", "name" => "John", "age" => 25 }
local t = { "one", "two", name = "John", age = 25, [4] = "four" }
-- Will print (for example, the order is undetermined): "name", 2, 4, "age", 1
for k in pairs(t) do print(k) end
-- Will print 1, 2
for k in ipairs(t) do print(k) end
-- Both ipairs and pairs return a key and a value, so you could also use:
for k, v in pairs(t) do print(k, v) end
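The "missing key" rule is a common source of surprises when an element in the middle of an array is set to nil. A minimal sketch:

```lua
local t = { "a", "b", "c", "d" }
t[3] = nil -- creates a hole at index 3

local seen = {}
for i, v in ipairs(t) do seen[#seen + 1] = v end
print(table.concat(seen, ", ")) -- prints a, b: ipairs stops at the first missing key

local count = 0
for k, v in pairs(t) do count = count + 1 end
print(count) -- prints 3: pairs still visits index 4
```

To remove an element without leaving a hole, table.remove is usually the better choice.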
Semicolons and line breaks
Statements need to be separated either by a line break or a semicolon.
- You can use both a semicolon and a line break; the semicolon is then redundant, but harmless.
- Line breaks can also be added almost anywhere and behave as whitespace most of the time (you can add them after most keywords, for example, so you can keep if...then...end structures on one line or split them across multiple lines).
Locals, globals and includes
- The include statement allows you to include a Lua file into another one (see Specificities of the Lua implementation in Civ5). When you do so, the globals defined in the included file are imported into the parent file. However, the local variables are only visible from within the file that declared them.
- Another peculiarity of locals: they are only visible after the point where they have been declared in the file, while globals are always visible. An example is worth a thousand words.
function Func1()
print(someLocal or "nil")
print(someGlobal or "nil")
end
local someLocal = "foo"
function Func2()
print(someLocal or "nil")
print(someGlobal or "nil")
end
someGlobal = "global"
someLocal = "local"
Func1() -- prints "nil", then "global": someLocal is assigned but not visible from Func1
Func2() -- prints "local", then "global": someLocal is assigned and visible from Func2
- Performance-wise, locals are the fastest to access: they are resolved at compile time and stored in registers. Locals captured from an enclosing function (upvalues) are slightly slower. Globals are the slowest to retrieve, since each access is a table lookup by name.
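Since globals are the slowest to retrieve, a common idiom in hot loops is to copy a frequently used global into a local once. The sketch below assumes nothing beyond the standard math library; SumOfHalves is a hypothetical function used for illustration, and the gain is only worth it in code that runs very often.

```lua
local floor = math.floor -- one global lookup plus one field lookup, done once

local function SumOfHalves(n) -- hypothetical function for illustration
  local total = 0
  for i = 1, n do
    total = total + floor(i / 2) -- uses the local alias: no global lookup here
  end
  return total
end

print(SumOfHalves(10)) -- prints 25
```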
Semi-advanced topics
Closures
Closures are functions that capture variables which persist across calls.
This is typically achieved through a "parent" function that declares a local "closed" variable and returns another child "closure" function using this local. The "closed" variable is then unreachable for anyone but the "closure". And since the "closed" variable is not declared in the "closure"'s own scope, it is not reinitialized on every call of the "closure".
function GiveMeAFunc() -- the parent function
local x = 0 -- the closed variable
return function() -- the closure function embedding "closed"
x = x + 1
return x
end
end
local func = GiveMeAFunc()
print(func()) -- Prints 1
print(func()) -- Prints 2
print(func()) -- Prints 3
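Each call to the parent function creates a fresh closed variable, so two closures built by the same parent do not share state. A minimal sketch, where MakeCounter is a hypothetical factory taking a step as parameter:

```lua
local function MakeCounter(step) -- hypothetical factory
  local count = 0                -- a new closed variable for every call to MakeCounter
  return function()
    count = count + step
    return count
  end
end

local byOne, byTen = MakeCounter(1), MakeCounter(10)
local a1 = byOne() -- 1
local a2 = byOne() -- 2
local b1 = byTen() -- 10: byTen has its own count
local a3 = byOne() -- 3: byOne was not affected by byTen
print(a1, a2, b1, a3) -- prints 1, 2, 10, 3
```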
Iterators
Iterators are functions for use in for...in loops.
Let's consider the following code: for var1, var2, var3 in iterable() do.
- iterable has the following signature: iterator, state, var1 iterable(). It returns a function, a state, and the initial value of var1.
- iterator has the following signature: var1, var2, var3 iterator(state, var1). It takes the state and the last value of var1 and returns var1, var2 and var3.
- The loop stops when var1 is nil.
- The for...in loop is equivalent to:
local iterator, state, var1, var2, var3
iterator, state, var1 = iterable()
while true do
var1, var2, var3 = iterator(state, var1)
if var1 == nil then break end
-- Insert here the user code defined in the for...in loop
end
- Note that we used var1, var2 and var3 for the example but there is no restriction on the number of variables. You could have one or a hundred.
Let's write the "ipairs" function.
local function ipairs(table) -- The iterable function
local iterator = function(table, key) -- The iterator function
key = key + 1
local value = table[key]
if value ~= nil then return key, value end -- Returning nothing (nil) stops the loop
end
return iterator, table, 0
end
The same with closures: it's easier.
local function ipairs(table) -- The iterable function
local key = 0 -- Both "table" and "key" are closed variables.
local iterator = function() -- The iterator function, a closure: it ignores the arguments provided to it
key = key + 1
local value = table[key]
if value ~= nil then return key, value end -- Returning nothing (nil) stops the loop
end
return iterator -- No need to return a state or an initial value since the closure does not use its arguments.
end
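The same protocol can express other traversal orders. As an illustration, here is a sketch of a stateless iterator that walks an array backwards; ripairs and reverseIter are hypothetical names, not standard functions.

```lua
local function reverseIter(t, i) -- the iterator: receives the state and the last key
  i = i - 1
  if i > 0 then return i, t[i] end -- returning nothing (nil) ends the loop
end

local function ripairs(t) -- hypothetical name: the iterable
  return reverseIter, t, #t + 1 -- iterator, state, initial key
end

local out = {}
for i, v in ripairs({ "a", "b", "c" }) do
  out[#out + 1] = i .. "=" .. v
end
print(table.concat(out, " ")) -- prints 3=c 2=b 1=a
```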
No integer type
Lua offers no support for integers and only has a 64-bit floating-point number type.
Actually, this is only troublesome for a few algorithms that use very large integers (>2^52) or require fast integer arithmetic. But many developers misunderstand floating-point numbers and believe the problem is broader than it is. For example, does the following assertion surprise you at least a bit: "any integer smaller than 2^52 can be accurately represented with a 64-bit floating-point number without any error"? If it did, you suffer from some misconceptions and you should read the following carefully.
Many developers confuse rounding errors with representation errors.
- Rounding errors only happen if you need a very high number of digits to represent a number, for example when adding a very large number to a very small one. Rounding errors are very rarely encountered in day-to-day applications. Actually, most developers work their whole career without ever encountering them.
- Representation errors are the real problem: 0.1 is expressed in decimal, but it has no finite representation in binary: it requires an infinite number of binary digits. This means that if you write 0.1 in your code, it will be translated to a binary number close to 0.1 but not exactly equal to it. So if you sum 0.1 ten times, you will sum ten times something close to 0.1 and the result will not be exactly 1.
So the real problem for most developers only lies in the decimal-binary conversion that arises when you manipulate numbers in your code that do not have a finite representation in binary. As long as you do not use such numbers, there will be no problem. And here is the good news: integers do not suffer from any representation problem! Indeed, any integer can be represented with a finite number of binary digits. This is why the absence of an integer type typically does not cause problems, apart from the performance penalty.
Here are a few examples to illustrate that point.
-- This loop works as intended: no representation error and the upper bound is far below 2^52.
local sum = 0
for i = 0, 10000000000, 1 do sum = sum + 1 end
assert(sum == 10000000001)
-- This loop works as intended: 1/16 is a floating-point with an exact representation in binary
local sum = 0
for i = 0, 1, 1/16 do sum = sum + 1/16 end
assert(sum == 17/16)
-- This loop does NOT work as intended: 0.1 does not have an exact representation in binary.
-- It's the same for Lua, C and almost every programming language.
local sum = 0
for i = 0, 1, 0.1 do sum = sum + 0.1 end
assert(sum == 1.1) -- throws an error
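Two common ways to cope with representation errors are to compare with a tolerance, or to count with integers and divide once at the end. The following is a minimal sketch; NearlyEqual and its default tolerance of 1e-9 are assumptions made for illustration, and the right tolerance depends on the values at hand.

```lua
local function NearlyEqual(a, b, epsilon) -- hypothetical helper
  return math.abs(a - b) <= (epsilon or 1e-9)
end

local sum = 0
for i = 1, 10 do sum = sum + 0.1 end
print(sum == 1)            -- prints false: the representation errors accumulated
print(NearlyEqual(sum, 1)) -- prints true

-- Alternative: count with integers and divide once at the end.
local tenths = 0
for i = 1, 10 do tenths = tenths + 1 end
print(tenths / 10 == 1)    -- prints true
```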
Advanced topics
Metatables and metamethods
- Any table or userdata in Lua can have a metatable. Many objects can share the same metatable.
- It may be used to define fields on all those objects at once, to implement operator overloading for those objects, to set up event handlers for field assignments, etc.
- In a way, metatables are analogous to virtual tables in object-oriented languages, but they are not just for methods: they also cover fields and operators.
- You can get an object's metatable through getmetatable and assign it with setmetatable.
- See also the official metatables reference
One example.
-- Define two tables a and b, and set "meta" as their metatable
local a, b, meta = {}, {}, {}
setmetatable(a, meta)
setmetatable(b, meta)
-- Define an __index table on the metatable. It contains one method: hello
local metaIndex = { hello = function() print("hello world") end };
meta.__index = metaIndex;
-- Print "hello world" twice
a.hello()
b.hello()
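The same __index mechanism is enough to set up inheritance: a missing member is looked up in the "class" table, then in its parent. The following is a sketch of one common pattern, not the only way to do it; Animal and Dog are hypothetical names.

```lua
local Animal = {} -- hypothetical base "class"
Animal.__index = Animal
function Animal.New(name) return setmetatable({ name = name }, Animal) end
function Animal:Speak() return self.name .. " makes a sound" end
function Animal:Name() return self.name end

local Dog = setmetatable({}, { __index = Animal }) -- Dog falls back to Animal
Dog.__index = Dog
function Dog.New(name) return setmetatable({ name = name }, Dog) end
function Dog:Speak() return self.name .. " barks" end -- overrides Animal:Speak

local rex = Dog.New("Rex")
print(rex:Speak()) -- prints Rex barks
print(rex:Name())  -- prints Rex: inherited from Animal
```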
Here are the main fields a metatable can have:
- __index: used when you read a member that the object does not have. May be a table or a function: value func(object, key)
- __newindex: used when you assign a value to a member that the object does not have yet. May be a table or a function: func(object, key, value)
- __call: used when you invoke the object as a function. Must be a function: func(object, <args>)
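Here is a sketch of the three fields used as functions on a single table. proxy and log are arbitrary names; rawset is the standard function that assigns a member without triggering __newindex.

```lua
local log = {}
local proxy = setmetatable({}, {
  __index = function(object, key)           -- called when the key is missing
    return "default for " .. key
  end,
  __newindex = function(object, key, value) -- called when assigning a missing key
    log[#log + 1] = key
    rawset(object, key, value)              -- store without triggering __newindex again
  end,
  __call = function(object, a, b)           -- called when the table is invoked
    return a + b
  end,
})

print(proxy.color)  -- prints default for color
proxy.color = "red" -- goes through __newindex
print(proxy.color)  -- prints red: the key now exists, so __index is skipped
print(proxy(2, 3))  -- prints 5
```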
Besides that, it can also have the following fields for unary operators. They must be functions: result func(object)
- __unm: unary minus
- __len: the sharp operator (#). Can only be used on userdata; on tables the default operator will always be called.
Finally, it can have fields for binary operators. When both operands define an overload, the left one is used. They must be functions: result func(left, right)
- __add: addition
- __sub: subtraction
- __mul: multiplication
- __div: division
- __mod: modulo
- __pow: exponentiation
- __eq: equality comparison. Lua only calls this overload when both operands have the same type.
- There is no __neq field: (a ~= b) is evaluated as not (a == b), so inequality also goes through __eq.
- __lt: less than comparison. (a > b) is considered equivalent to (b < a).
- __le: less than or equal to comparison. (a >= b) is considered equivalent to (b <= a). If __le is not defined, Lua will use __lt, considering that (a <= b) is equivalent to not (b < a).
- __concat: concatenation
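As an illustration of operator overloading, here is a sketch of a small 2D vector type. Vec and NewVec are hypothetical names, and ordering vectors by their squared length is an arbitrary choice made to demonstrate __lt.

```lua
local Vec = {} -- hypothetical 2D vector type, also used as the metatable
Vec.__index = Vec

local function NewVec(x, y)
  return setmetatable({ x = x, y = y }, Vec)
end

Vec.__add = function(a, b) return NewVec(a.x + b.x, a.y + b.y) end
Vec.__unm = function(a) return NewVec(-a.x, -a.y) end
Vec.__eq  = function(a, b) return a.x == b.x and a.y == b.y end
Vec.__lt  = function(a, b) return a.x * a.x + a.y * a.y < b.x * b.x + b.y * b.y end

local p, q = NewVec(1, 2), NewVec(3, 4)
local s = p + q
print(s.x, s.y)          -- prints 4, 6
print(s == NewVec(4, 6)) -- prints true: __eq compares the contents
print(p ~= q)            -- prints true: ~= is the negation of __eq
print(p < q, q > p)      -- prints true, true: > swaps the operands of __lt
print((-p).x)            -- prints -1
```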