49  Managing variables

Rizin can manage local variables and arguments regardless of where they live, whether on the stack or in registers. Automatic variable analysis is enabled by default and can be disabled with the analysis.vars configuration option.

The main variable commands live in the afv namespace:

[0x00001100]> afv?
Usage: afv<?>   # Manipulate arguments/variables in a function
| afvl[j*lt]               # List all variables and arguments of the current function
| afv=                     # List function variables and arguments with disasm refs
| afv- <varname|*>         # Remove all variables/arguments or just the specified one
| afva                     # Analyze function arguments/locals
| afvd [<varname>]         # Display the value of arguments/variables
| afvf                     # Show BP relative stackframe variables
| afvn <new_name> [<old_name>] # Rename argument/variable in current function
| afvR [<varname>]         # List addresses where vars are accessed (READ)
| afvW [<varname>]         # List addresses where vars are accessed (WRITE)
| afvt <varname> <type>    # Change type for given argument/local
| afvx[jav]                # Show argument/variable xrefs in a function
| afvs[j*-gs?]             # Manipulate stack-based arguments/locals
| afvr[j*-gs?]             # Manipulate register-based arguments/locals

The afvr and afvs commands share the same interface: afvr manipulates register-based arguments and variables, while afvs manipulates stack-based (SP-relative) ones. The help for afvr therefore also shows how afvs works:

[0x00001100]> afvr?
Usage: afvr[j*-gs?]   # Manipulate register-based arguments/locals
| afvr[j*] [<reg> <name> [<type>]] # List register-based arguments and locals / Define a new one
| afvr- <varname>            # Delete register-based argument/local with the given name
| afvr-*                     # Delete all register-based arguments/locals
| afvrg <reg> <addr>         # Define register-based arguments and locals get references
| afvrs <reg> <addr>         # Define register-based arguments and locals set references

Like many other things, variable detection is performed by Rizin automatically, but the results can be adjusted with these argument and variable commands. This analysis relies heavily on preloaded function prototypes and the calling convention, so loading symbols can improve it. After changing something, you can rerun variable analysis with the afva command. Variable analysis is often accompanied by type analysis; see the afta command.

Naming things is one of the most important aspects of reverse engineering. You can rename any argument or variable with afvn, and the new name is applied everywhere the variable is referenced. You can also remove a variable or argument with the afv- command.

As mentioned above, variable analysis relies heavily on type information. Let’s list the stack variables that Rizin found:

[0x000011e9]> afvs
var unknown_t var_48h @ stack - 0x48
var unknown_t var_3ch @ stack - 0x3c
var unknown_t var_30h @ stack - 0x30
var unknown_t var_28h @ stack - 0x28
var unknown_t var_20h @ stack - 0x20
var unknown_t var_18h @ stack - 0x18
var unknown_t var_10h @ stack - 0x10

unknown_t is the default placeholder type for a variable whose exact type cannot be inferred. This is where the afvt command becomes important, because it lets you change the type of a variable:

[0x000011e9]> afvt var_48h const char *

[0x000011e9]> afvs
var const char *var_48h @ stack - 0x48
var unknown_t var_3ch @ stack - 0x3c
var unknown_t var_30h @ stack - 0x30
var unknown_t var_28h @ stack - 0x28
var unknown_t var_20h @ stack - 0x20
var unknown_t var_18h @ stack - 0x18
var unknown_t var_10h @ stack - 0x10
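When scripting around this listing, it can help to turn each line into structured data. The following is a minimal Python sketch, not part of Rizin itself. It assumes every line has the shape "var <type> <name> @ stack - 0x<offset>" as in the output above; the helper name parse_afvs and the acceptance of an "arg" prefix are assumptions made for illustration.

```python
import re

# Sample `afvs` output copied from the session above (shortened).
AFVS_OUTPUT = """\
var const char *var_48h @ stack - 0x48
var unknown_t var_3ch @ stack - 0x3c
var unknown_t var_30h @ stack - 0x30
"""

# Assumed line shape: "<var|arg> <type> <name> @ stack <sign> 0x<offset>".
# The name is the last identifier before " @ "; whatever sits between the
# leading keyword and the name is treated as the type.
LINE_RE = re.compile(
    r"^(?P<kind>var|arg)\s+(?P<type>.+?)\s*(?P<name>[A-Za-z_]\w*)"
    r"\s+@\s+stack\s+(?P<sign>[-+])\s+(?P<off>0x[0-9a-fA-F]+)$"
)


def parse_afvs(text):
    """Return a list of (name, type, signed stack offset) tuples."""
    variables = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed shape
        offset = int(match.group("off"), 16)
        if match.group("sign") == "-":
            offset = -offset
        variables.append((match.group("name"), match.group("type"), offset))
    return variables


def untyped(variables):
    """Names of variables that still carry the unknown_t placeholder."""
    return [name for name, type_, _ in variables if type_ == "unknown_t"]


if __name__ == "__main__":
    for name, type_, offset in parse_afvs(AFVS_OUTPUT):
        print(f"{name:10} {type_:15} {offset:#x}")
```

A list like the one returned by untyped() gives a quick overview of which variables are still candidates for afvt.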

A less commonly used feature, still under heavy development, is the distinction between variables being read and variables being written. The afvR command lists the variables that are read, and afvW lists those that are written. Both commands show the addresses where those accesses occur:

[0x000011e9]> afvR
   var_3ch
   var_48h
   var_30h  0x1212,0x1254
   var_28h  0x1222,0x1267
   var_20h  0x1232
   var_18h  0x1236
   var_10h  0x124d,0x1258,0x126b,0x127a,0x1286

[0x000011e9]> afvW
   var_3ch  0x11f5
   var_48h  0x11f8
   var_30h  0x1203
   var_28h  0x120e
   var_20h  0x121e
   var_18h  0x122e
   var_10h  0x1249
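The two listings are easy to combine in a script, for example to spot variables that are written but never read. The following Python sketch assumes the column layout shown above (a variable name followed by an optional comma-separated address list); the helper name parse_accesses is hypothetical and the sample text is copied from this session.

```python
# Sample `afvR` and `afvW` output copied from the session above.
AFVR_OUTPUT = """\
   var_3ch
   var_48h
   var_30h  0x1212,0x1254
   var_28h  0x1222,0x1267
   var_20h  0x1232
   var_18h  0x1236
   var_10h  0x124d,0x1258,0x126b,0x127a,0x1286
"""

AFVW_OUTPUT = """\
   var_3ch  0x11f5
   var_48h  0x11f8
   var_30h  0x1203
   var_28h  0x120e
   var_20h  0x121e
   var_18h  0x122e
   var_10h  0x1249
"""


def parse_accesses(text):
    """Map each variable name to the list of addresses that access it."""
    accesses = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        name = fields[0]
        # A variable with no accesses has no second column at all.
        addrs = [int(a, 16) for a in fields[1].split(",")] if len(fields) > 1 else []
        accesses[name] = addrs
    return accesses


reads = parse_accesses(AFVR_OUTPUT)
writes = parse_accesses(AFVW_OUTPUT)

# Variables that are stored to but never loaded again in this function.
write_only = [name for name, addrs in writes.items() if addrs and not reads.get(name)]

if __name__ == "__main__":
    print("written but never read:", ", ".join(write_only))
```

For the sample above, the write-only variables are the two slots that hold the spilled copies of argc and argv.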

49.1 Type inference

Type inference for local variables and arguments is integrated with the aft command.

Let’s see an example of this with a simple hello_world binary:

[0x00001100]> aa
[x] Analyze all flags starting with sym. and entry0 (aa)

[0x00001100]> s main

[0x000011e9]> pdf
            ; DATA XREF from entry0 @ 0x1118
/ int main(int argc, char **argv, char **envp);
|           ; arg int argc @ rdi
|           ; arg char **argv @ rsi
|           ; var int64_t var_48h @ stack - 0x48
|           ; var int64_t var_3ch @ stack - 0x3c
|           ; var int64_t var_30h @ stack - 0x30
|           ; var int64_t var_28h @ stack - 0x28
|           ; var int64_t var_20h @ stack - 0x20
|           ; var int64_t var_18h @ stack - 0x18
|           ; var int64_t var_10h @ stack - 0x10
|           0x000011e9      endbr64
|           0x000011ed      push  rbp
|           0x000011ee      mov   rbp, rsp
|           0x000011f1      sub   rsp, 0x40
|           0x000011f5      mov   dword [var_3ch], edi                 ; argc
|           0x000011f8      mov   qword [var_48h], rsi                 ; argv
|           0x000011fc      lea   rax, [str.Hello]                     ; 0x2004 ; "Hello "
|           0x00001203      mov   qword [var_30h], rax
|           0x00001207      lea   rax, [str.world]                     ; 0x200b ; "world!"
|           0x0000120e      mov   qword [var_28h], rax
|           0x00001212      mov   rax, qword [var_30h]
|           0x00001216      mov   rdi, rax
|           0x00001219      call  sym.imp.strlen                       ; sym.imp.strlen ; size_t strlen(const char *s)
|           0x0000121e      mov   qword [var_20h], rax
|           0x00001222      mov   rax, qword [var_28h]
|           0x00001226      mov   rdi, rax
|           0x00001229      call  sym.imp.strlen                       ; sym.imp.strlen ; size_t strlen(const char *s)
|           0x0000122e      mov   qword [var_18h], rax
|           0x00001232      mov   rdx, qword [var_20h]
|           0x00001236      mov   rax, qword [var_18h]
|           0x0000123a      add   rax, rdx
|           0x0000123d      add   rax, 1
|           0x00001241      mov   rdi, rax
|           0x00001244      call  sym.imp.malloc                       ; sym.imp.malloc ; void *malloc(size_t size)
|           0x00001249      mov   qword [var_10h], rax
|           0x0000124d      cmp   qword [var_10h], 0
|       ,=< 0x00001252      je    0x1292
|       |   0x00001254      mov   rdx, qword [var_30h]
|       |   0x00001258      mov   rax, qword [var_10h]
|       |   0x0000125c      mov   rsi, rdx
|       |   0x0000125f      mov   rdi, rax
|       |   0x00001262      call  sym.imp.strcpy                       ; sym.imp.strcpy ; char *strcpy(char *dest, const char *src)
|       |   0x00001267      mov   rdx, qword [var_28h]
|       |   0x0000126b      mov   rax, qword [var_10h]
|       |   0x0000126f      mov   rsi, rdx
|       |   0x00001272      mov   rdi, rax
|       |   0x00001275      call  sym.imp.strcat                       ; sym.imp.strcat ; char *strcat(char *s1, const char *s2)
|       |   0x0000127a      mov   rax, qword [var_10h]
|       |   0x0000127e      mov   rdi, rax
|       |   0x00001281      call  sym.imp.puts                         ; sym.imp.puts ; int puts(const char *s)
|       |   0x00001286      mov   rax, qword [var_10h]
|       |   0x0000128a      mov   rdi, rax
|       |   0x0000128d      call  sym.imp.free                         ; sym.imp.free ; void free(void *ptr)
|       `-> 0x00001292      mov   eax, 0
|           0x00001297      leave
\           0x00001298      ret

After applying aft:

[0x000011e9]> aeim

[0x000011e9]> aft

[0x000011e9]> pdf
            ; DATA XREF from entry0 @ 0x1118
            ;-- rip:
/ int main(int argc, char **argv, char **envp);
|           ; arg int argc @ rdi
|           ; arg char **argv @ rsi
|           ; var char **var_48h @ stack - 0x48
|           ; var int var_3ch @ stack - 0x3c
|           ; var const char *src @ stack - 0x30
|           ; var const char *s2 @ stack - 0x28
|           ; var size_t var_20h @ stack - 0x20
|           ; var size_t size @ stack - 0x18
|           ; var char *dest @ stack - 0x10
|           0x000011e9      endbr64
|           0x000011ed      push  rbp
|           0x000011ee      mov   rbp, rsp
|           0x000011f1      sub   rsp, 0x40
|           0x000011f5      mov   dword [var_3ch], edi                 ; argc
|           0x000011f8      mov   qword [var_48h], rsi                 ; argv
|           0x000011fc      lea   rax, [str.Hello]                     ; 0x2004 ; "Hello "
|           0x00001203      mov   qword [src], rax
|           0x00001207      lea   rax, [str.world]                     ; 0x200b ; "world!"
|           0x0000120e      mov   qword [s2], rax
|           0x00001212      mov   rax, qword [src]
|           0x00001216      mov   rdi, rax                             ; const char *s
|           0x00001219      call  sym.imp.strlen                       ; sym.imp.strlen ; size_t strlen(const char *s)
|           0x0000121e      mov   qword [var_20h], rax
|           0x00001222      mov   rax, qword [s2]
|           0x00001226      mov   rdi, rax                             ; const char *s
|           0x00001229      call  sym.imp.strlen                       ; sym.imp.strlen ; size_t strlen(const char *s)
|           0x0000122e      mov   qword [size], rax
|           0x00001232      mov   rdx, qword [var_20h]
|           0x00001236      mov   rax, qword [size]
|           0x0000123a      add   rax, rdx
|           0x0000123d      add   rax, 1
|           0x00001241      mov   rdi, rax                             ; size_t size
|           0x00001244      call  sym.imp.malloc                       ; sym.imp.malloc ; void *malloc(size_t size)
|           0x00001249      mov   qword [dest], rax
|           0x0000124d      cmp   qword [dest], 0
|       ,=< 0x00001252      je    0x1292
|       |   0x00001254      mov   rdx, qword [src]
|       |   0x00001258      mov   rax, qword [dest]
|       |   0x0000125c      mov   rsi, rdx                             ; const char *src
|       |   0x0000125f      mov   rdi, rax                             ; char *dest
|       |   0x00001262      call  sym.imp.strcpy                       ; sym.imp.strcpy ; char *strcpy(char *dest, const char *src)
|       |   0x00001267      mov   rdx, qword [s2]
|       |   0x0000126b      mov   rax, qword [dest]
|       |   0x0000126f      mov   rsi, rdx                             ; const char *s2
|       |   0x00001272      mov   rdi, rax                             ; char *s1
|       |   0x00001275      call  sym.imp.strcat                       ; sym.imp.strcat ; char *strcat(char *s1, const char *s2)
|       |   0x0000127a      mov   rax, qword [dest]
|       |   0x0000127e      mov   rdi, rax                             ; const char *s
|       |   0x00001281      call  sym.imp.puts                         ; sym.imp.puts ; int puts(const char *s)
|       |   0x00001286      mov   rax, qword [dest]
|       |   0x0000128a      mov   rdi, rax                             ; void *ptr
|       |   0x0000128d      call  sym.imp.free                         ; sym.imp.free ; void free(void *ptr)
|       `-> 0x00001292      mov   eax, 0
|           0x00001297      leave
\           0x00001298      ret

It also extracts type information from format strings such as printf("fmt : %s , %u , %d", ...). The format specifications are read from analysis/d/spec.sdb.

You can create a new profile that defines a set of format characters for a particular library, operating system, or programming language, like this:

win=spec
spec.win.u32=unsigned int

Then switch the default specification to the newly created one with this configuration variable: e analysis.spec=win

For more information about support for primitive and user-defined types in Rizin, refer to the types chapter.
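To make the key layout of such a profile concrete, here is a small Python sketch that reads the two lines above and looks up a format specifier. It only models the spec.<profile>.<specifier>=<type> naming visible in the example; how Rizin itself loads and applies the file is not shown here, and the function names load_specs and lookup are hypothetical.

```python
# The profile text from the example above.
PROFILE_TEXT = """\
win=spec
spec.win.u32=unsigned int
"""


def load_specs(text):
    """Group 'spec.<profile>.<specifier>=<type>' entries by profile name."""
    profiles = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a key=value line
        parts = key.split(".")
        # Only keys of the form spec.<profile>.<specifier> carry a type;
        # a line such as "win=spec" is skipped by this sketch.
        if len(parts) == 3 and parts[0] == "spec":
            profiles.setdefault(parts[1], {})[parts[2]] = value
    return profiles


def lookup(profiles, profile, specifier):
    """Return the type for a specifier in a profile, or None if unknown."""
    return profiles.get(profile, {}).get(specifier)


if __name__ == "__main__":
    specs = load_specs(PROFILE_TEXT)
    print(lookup(specs, "win", "u32"))
```

Keeping one dictionary per profile mirrors the idea that analysis.spec selects a whole set of specifiers at once.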