File Observing exercise
## How to inspect an executable program?
For example the _ls_ program on UNIX systems
(some of the commands on this page need to be checked)
```bash
# copy ls as ls_command
$ cp /bin/ls ls_command
# this will do the same as ls
$ ./ls_command
```
`$ objdump -d ls_command`
(an object file: code that can be "linked", to run inside another program)
it shows the assembly/assembler code, a human-readable notation for machine code, which is the lowest level of code: <https://en.wikipedia.org/wiki/Assembly_language>
on the left hexadecimal code / on the right human 'readable' language
1st column: position in hex code, the address
2nd column: byte code, the raw bytes of the file
try:
```bash
# display hex, bytes, text
$ hexcurse ls_command
# to display in binary
$ hexcurse -b ls_command
```
this shows it in binary 0's and 1's
translation from binary into hexadecimal and ASCII
the way we look at content of files is based on 8 bits
1st column: address
2nd column: binary, in groups of 8 bits
3rd column: the ASCII translation of the bits into letters
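The translation between these notations can be tried out with standard tools. This is a minimal sketch that uses only `printf` and `od`; the helper name `to_binary` is made up for this example and is not part of any tool.

```shell
# The same two bytes ("ls") shown in three notations.
printf 'ls' | od -An -tx1   # hexadecimal
printf 'ls' | od -An -tu1   # decimal
printf 'ls' | od -An -c     # ASCII characters

# Hypothetical helper: print one byte value (0-255) as 8 binary digits.
to_binary() {
    n=$1
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$(( n % 2 ))$bits"   # prepend the lowest bit
        n=$(( n / 2 ))
        i=$(( i + 1 ))
    done
    echo "$bits"
}

to_binary 108    # the letter "l" is 108 in decimal, 6c in hex
```

The last call prints `01101100`, the 8 bits behind the letter "l".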
Intel 80x86 Assembly Language
<http://www.mathemainzel.info/files/x86asmref.html>
OpCodes
different representations of programs, ascii and hex
Finding military forensic tools for observing files.
![binvis.io on the file /bin/ls]( http://etherbox.local/home/pi/images/Screenshot_from_2017-06-07_164407.png )
<http://binvis.io>
<http://etherbox.local/var/www/html/binvis.io/>
used for studying viruses; the tool was used to find the ransomware, and 15 years ago to crack codes. Finding the 'jump'
bytes translated to shades of grey/colours instead of letters, in between 00000000 is black - 11111111 is white
There is always a jump in the programme where the copyright/license should be inserted; if you replace the 'jump' by 90 (= no operation) you can play!
open your file in Gimp as raw data
![]( http://etherbox.local/home/pi/images/GIMPOpenFileAsImageData.png )
(It's necessary to select "All files" (default is 'all images') in drop down filter and then select "Raw Image Data" as the File Type)
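A similar result can be had without the import dialog by putting a small image header in front of the bytes. This is only a sketch: it assumes the binary PGM format (`P5`), which most image viewers (GIMP included) can open, and the function name `bytes_to_pgm` and the file names are invented for the example.

```shell
# Hypothetical helper: wrap the first width*height bytes of a file in a
# PGM header, so that each byte becomes one grey pixel
# (0 = black, 255 = white).
bytes_to_pgm() {
    input=$1
    output=$2
    width=$3
    height=$4
    {
        printf 'P5\n%d %d\n255\n' "$width" "$height"
        head -c "$(( width * height ))" "$input"
    } > "$output"
}

# Example call (paths are placeholders):
# bytes_to_pgm ls_command ls_command.pgm 256 256
```

If the input file has fewer than width × height bytes the image is incomplete, so pick dimensions that fit the file size.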
What is the relationship between observation and translation?
- translations here from binary to ? to color of pixel/frequency/ascii
man ascii
'every letter' of the alphabet can be represented by a number
![src: https://en.wikipedia.org/wiki/ASCII#/media/File:US-ASCII_code_chart.png ](http://etherbox.local/home/pi/images/US-ASCII_code_chart.png )
## So ... where is the software?
for this we can use the command 'dd'
/dev/mem contains your computer memory
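dd has an input (if=) and an output (of=), each of which is either a file or a hard drive/memory device. count is the number of blocks to copy, and skip is the number of input blocks to jump over first; a block is bs bytes (512 by default).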
```bash
$ dd if=/dev/mem of=memory.bin count=10000 skip=132143
```
another exercise:
to copy a sector of the physical memory or virtual memory:
physical
```bash
$ sudo dd if=/dev/mem of=memfile.dump bs=1 skip=_startingbyte_ count=_totalbytes_
```
e.g. for 1 MB of RAM:
```bash
$ sudo dd if=/dev/mem of=memfile.dump bs=1 skip=23500 count=1000000
# virtual memory: same as above, but with: sudo dd if=/proc/kcore
```
to cut files to a specific byte size:
```bash
$ dd if=mem1000000.dump of=smaller.raw bs=1 count=3000
```
copies only the first 3000 bytes of the file.
```bash
$ dd if=mem1000000.dump of=smaller.raw bs=1 skip=4000 count=3000
```
same but starting after the first 4000 bytes
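To see what `skip` and `count` do without touching memory or a disk, the same commands can be run on a tiny sample file. A minimal sketch; the file names are placeholders.

```shell
# Build a small sample file so the offsets are easy to follow.
workdir=$(mktemp -d)
printf '0123456789abcdefghij' > "$workdir/sample.bin"    # 20 bytes

# First 5 bytes: bs=1 makes count (and skip) count single bytes.
dd if="$workdir/sample.bin" of="$workdir/head.bin" bs=1 count=5 2>/dev/null

# 5 bytes, starting after the first 10 bytes.
dd if="$workdir/sample.bin" of="$workdir/middle.bin" bs=1 skip=10 count=5 2>/dev/null

cat "$workdir/head.bin"; echo      # 01234
cat "$workdir/middle.bin"; echo    # abcde
```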
"if you can cat, you can dd"
```bash
$ sudo debugfs /dev/sda3
```
"in Linux you can't read the kernel memory addresses, they are protected" while the kernel runs
using binvis.io to visualise the memory
you can see non used areas, where is text ...
files are not continuous streams of bytes, but blocks
## How does a file sit in the harddrive?
### Inode
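an inode (i-node) is a data structure identified by a number that is unique within a file system; it records a file's metadata and where its data sits on the hard drive: <https://en.wikipedia.org/wiki/Inode>
software that links inodes back to files on the hard drive: debugfs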
used for recovering data & forensics
See also: <http://teaching.idallen.com/cst8207/13w/notes/450_file_system.html>
<https://en.wikipedia.org/wiki/Inode_pointer_structure>
List the inodes of a folder: ls -i / or of a single file: ls -i name_of_file
```bash
$ ls -i ls_command
```
output example: 8443114 ls_command
or:
```bash
$ stat ls_command
```
example output:
```bash
File: 'ls_command'
Size: 126584 Blocks: 248 IO Block: 4096 regular file
Device: 805h/2053d Inode: 8443114 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 1000/ ana) Gid: ( 1000/ ana)
Access: 2017-06-10 09:12:16.839835082 +0200
Modify: 2017-06-07 16:08:10.298522614 +0200
Change: 2017-06-07 16:08:10.298522614 +0200
Birth: -
```
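The `Inode:` and `Links:` fields can be explored with a hard link, which is a second name for the same inode. A minimal sketch on throwaway files; all names are placeholders.

```shell
workdir=$(mktemp -d)
echo "hello" > "$workdir/original"

# A hard link is a second name for the same inode.
ln "$workdir/original" "$workdir/second_name"

# ls -i prints the inode number in front of each name; both should match.
ls -i "$workdir/original" "$workdir/second_name"

inode_a=$(ls -i "$workdir/original" | awk '{print $1}')
inode_b=$(ls -i "$workdir/second_name" | awk '{print $1}')

# Removing one name leaves the inode and its data reachable through the other.
rm "$workdir/original"
cat "$workdir/second_name"
```

Running `stat` on either name before the `rm` should show `Links: 2` instead of `Links: 1`.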
This is different from the permissions of files:
```bash
$ ls -l name_of_file
#example output:
-rwxr-xr-x 1 ana ana 126584 Jun 7 16:08 ls_command
# See the number of free inodes on a filesystem
# The Unix command "df -i" will display each partition, and give the percentage of inodes that are used.
$ df -i
# example output
Filesystem Inodes IUsed IFree IUse% Mounted on
udev 2010563 556 2010007 1% /dev
tmpfs 2015628 801 2014827 1% /run
/dev/sda5 12992512 1049361 11943151 9% /
tmpfs 2015628 11 2015617 1% /dev/shm
tmpfs 2015628 5 2015623 1% /run/lock
tmpfs 2015628 16 2015612 1% /sys/fs/cgroup
tmpfs 2015628 33 2015595 1% /run/user/1000
# Have a look at your file system (divisions harddisk, types etc)
$ df -T
# output example:
Filesystem Type 1K-blocks Used Available Use% Mounted on
udev devtmpfs 8042252 0 8042252 0% /dev
tmpfs tmpfs 1612504 9676 1602828 1% /run
/dev/sda5 ext4 204416808 156790276 37219672 81% /
tmpfs tmpfs 8062512 244 8062268 1% /dev/shm
tmpfs tmpfs 5120 4 5116 1% /run/lock
tmpfs tmpfs 8062512 0 8062512 0% /sys/fs/cgroup
tmpfs tmpfs 1612504 72 1612432 1% /run/user/1000
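# Start forensic/debugging program (the partition can also be sda3):
$ sudo debugfs /dev/sda5
# at the debugfs prompt, with the inode number between angle brackets:
debugfs:  stat <inode_of_file>
# or:
$ debugfs
debugfs:  stat <number_of_file>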
```
this gives the blocks in which the file is saved on your hard drive (fill it in the skip -2)
find your file system (types etc)
```bash
$ df -T
$ sudo debugfs /dev/sda3
debugfs:  stat <8443114>
# Example output:
Inode: 8443114 Type: regular Mode: 0755 Flags: 0x80000
Generation: 1834475523 Version: 0x00000000:00000001
User: 1000 Group: 1000 Size: 126584
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 248
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x593808ca:472c5fd8 -- Wed Jun 7 16:08:10 2017
atime: 0x593b9bd0:c83b7728 -- Sat Jun 10 09:12:16 2017
mtime: 0x593808ca:472c5fd8 -- Wed Jun 7 16:08:10 2017
crtime: 0x593808ca:472c5fd8 -- Wed Jun 7 16:08:10 2017
Size of extra inode fields: 32
EXTENTS:
(0-30):34304529-34304559
# or shorter:
$ sudo debugfs -R "stat <inode_number>" /dev/sda5
# Example output: identical to the debugfs output above
```
this is making me think about the different kinds of/definitions of abstraction
```bash
$ sudo dumpe2fs -h /dev/sda3    # to find out the block size
$ sudo dd if=/dev/sda3 of=newfile.txt bs=1 count=10 skip=209161134080
```
block size: to get to our file, dd has to jump over (skip) block number × block size bytes
- find out which block size the file system uses; the block size is fixed for a file system
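The jump can be worked out with shell arithmetic. This sketch takes the extent `(0-30):34304529-34304559` and the size 126584 from the example output; the block size of 4096 is an assumption (it is a common ext4 value, and `dumpe2fs -h` reports the real one), and `/dev/sda5` and `copy.raw` are placeholders. The dd commands are only printed, not run.

```shell
# BLOCK_SIZE is an assumption: check it with dumpe2fs -h.
BLOCK_SIZE=4096
FIRST_BLOCK=34304529     # start of the extent (0-30):34304529-34304559
LAST_BLOCK=34304559

byte_offset=$(( FIRST_BLOCK * BLOCK_SIZE ))
block_count=$(( LAST_BLOCK - FIRST_BLOCK + 1 ))

echo "byte offset of the file on the partition: $byte_offset"
echo "number of blocks in the extent: $block_count"

# Two ways to express the same jump (printed, not executed here):
echo "sudo dd if=/dev/sda5 of=copy.raw bs=1 skip=$byte_offset count=126584"
echo "sudo dd if=/dev/sda5 of=copy.raw bs=$BLOCK_SIZE skip=$FIRST_BLOCK count=$block_count"
```

With `bs=1` the skip value is a byte offset; with `bs` set to the block size it is simply the block number, and the copy is much faster.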
Goal: how does a file sit on a hard drive?
when you remove a file, you only remove the reference to it (the directory entry and the inode); the data sits there until it is overwritten. To really erase it you 'zero' it; the command 'shred' overwrites the file so that it cannot be found back
this happens because it takes less energy to not overwrite it with zeros
defragmenting: reorganises your hard disk so that empty spaces are brought together, and it takes less time to go over the index
organization/configuration and reconfiguration of the arrangement of a given material's potential states/state potentials
## Back to the question we started with
The difference between a file and software?
The difference between non-executable and executable files
Perhaps first step is permission (-x)
COM vs EXE files. The EXE has a header with metadata about its contents, while a COM is "headerless" and thus more open to modification, as the system will still attempt to execute it.
Any binary can be run. So when a file is corrupt that means that it cannot be rendered/read/made visible because something in the operating system's reading tells it not to finish/present
In essence there is no difference
Why is it necessary to make the distinction?
_everything is a file (unix)_
so: no difference between file and program and data
does it matter if something is a letter, character, ...
it has consequences for who/what can read
if it is executable, is it software?
if it is noise, it is still 'executed'?
It depends if it is gibberish or something that the computer can understand as a command
anything that can run is software?
(what does it mean to 'run')
storing software on a tape ... you can play it as an audiofile?
an audio file is fundamentally different from python,
it has some tolerance, you can play with it filter it
and python is different
ascii also not
you cannot interpret a perl script as a python code
you can frame data in different ways
-> we're touching the border of software here? is it not more about 'tolerance' and 'noise'?
software is language. Ascii is not language
data vs execution of data
ascii is not language, it is a convention to transform letters into bits
There is a difference between data and an executing of data, they have their own realities
a thought experiment:
* can we write a software on a cassette tape in a way that it can play music
* i take your python script and put it on a cassette and play it
* is it still a program now that it is stored and it is just sound it is not a script anymore?
encoding/decoding and tolerance ... translation! (can you get it back to a form that it can 'run')
but you can
the permission of natural language vs interrupting a software
a difference between conversion/translation and execution
(Thinking the machine that translates a punchcard into printed form... is it "executing" it ... well not exactly, it's just translating from punchcard form to printed symbol form)
we are dissecting file systems
i would suggest writing a file system that has the side effect of being playable
i don't see a difference between the two
we have no way to put a hard drive into a player
but with the cassette tapes that works
### A story on conversions
Pirate Broadcast: Making money out of cracking games. The code would be broadcast over FM, and at 12:00 everyone would record the code on their tapes. This tape could then be put in the Commodore. Is this data? It's software-as-a-service!
What kind of games? Commodore 64 games
Layers of abstraction, where each have different tolerances / expectations / NOISE
Independence day of software
IBM bundled their hardware and software
then there came a birth of software, a legal rule that says you cannot combine the two anymore
multiple births of software: <http://etherbox.local/home/pi/books/9783957960566-No-Software-just-Services.pdf#page=21>
use of data as a way to optimize service, cfr. the example of offering a login service to websites / register the times of users' logins, scan for anomalies, resell a service that raises an alarm when a user signs in at an irregular moment
"Software as configuration"
turing already said: data = programme
so why try to separate?
In Big Data software + data are inseparable.
abstraction ... different usages of that term today
abstracting away from its political / economical reality
Suggestion to create more a field of values, avoid a duality (data/program) ... When did the line start to get blurry?
## wrap up
This exercise has put the focus on the location of software, where it sits on the hard drive, instead of focusing on what it does
It would be a good idea to sketch lines of the terminologies, and see where these moments appear.
NEXT DAY 8-6 REHEARSAL
how to all share terminal
1. in the observed terminal, just type
$ screen
2. for the observers:
## log into a machine
$ ssh observatory@10.9.8.225
pass: techno
$ screen -rx
$ echo hello
START
## /dev/zero: file that generates constant zeros (for a random set of numbers there is /dev/urandom)
## xxd: command that takes standard input or a file and outputs it in hexadecimal format
$ xxd /dev/zero
https://linux.die.net/man/1/xxd
xxd is a hex dump tool (we used it yesterday to look at files)
with the -b option it outputs binary digits, else it is in hex
/dev/zero is a virtual device that generates an endless stream of zero bytes (there is nothing random about it)
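Because `xxd /dev/zero` never stops by itself, a bounded read is easier to look at. A minimal sketch with `head` and `od`; `/dev/urandom` is added here only for comparison and was not part of the session.

```shell
# Read 8 bytes from /dev/zero and show them as hex: every byte is 00.
head -c 8 /dev/zero | od -An -tx1

# For comparison, /dev/urandom gives different bytes on every read.
head -c 8 /dev/urandom | od -An -tx1

# Keep the zero bytes as one hex string, without spaces.
zeros=$(head -c 8 /dev/zero | od -An -tx1 | tr -d ' \n')
echo "$zeros"
```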
Presentation of Anita & Martino
The problem of how do you observe files that are also processes
Different ways to observe the binary:
there are two meanings of binary!
file in/as binary (1): as "raw data" (0 & 1) and hexadecimal
or file as a binary (2): executable files
they give instructions directly to the processor
See the content of a file as hex/binary:
$ hexcurse filename # (for example hexcurse /bin/ls )
$ xxd filename #(shows in hexadecimal, for example xxd /bin/ls)
$ xxd -b filename #(shows in binary)
Example ls command
# copy ls file
$ cp /bin/ls ls_command
$ hexcurse ls_command
See executable binaries / commands disassembled (i.e. human readable, f.ex. mov/call/compare-cmp/subtract-sub):
it disassembles an object file, makes it readable (human-readable instructions), which is (almost) a 1:1 translation of the machine code;
$ objdump -d ls
# f.ex. esp is a register in the processor (the stack pointer); it points to where the program stores data in memory
# 1st column: the address at which the instruction starts
# operation is split into different commands
the commands that are executed in the end depend on the architecture of the processor.
Overwriting the binary (hexcurse allows writing into the binary)
# look for text inside ls
$ ls --help | grep GNU
then search for it in the file opened with hexcurse (Ctrl+F GNU) --> replace with martino :P --> make file executable --> run --> Segmentation fault (core dumped)
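One plausible reason for the segmentation fault is length: "martino" has 7 letters and "GNU" only 3, so the longer text either overwrites the bytes that follow or shifts everything after it. A replacement of exactly the same length avoids both. The sketch below does such a patch with `dd` on a stand-in file; the file name, its content and the offset 6 are made up for the example.

```shell
workdir=$(mktemp -d)
# Stand-in for an executable: some bytes with the text "GNU" at offset 6.
printf 'abcdefGNUvwxyz' > "$workdir/fake_binary"

# Overwrite exactly 3 bytes at offset 6. conv=notrunc keeps the rest of the
# file; without it dd would cut the file off after the written bytes.
printf 'XYZ' | dd of="$workdir/fake_binary" bs=1 seek=6 conv=notrunc 2>/dev/null

cat "$workdir/fake_binary"; echo    # abcdefXYZvwxyz
```

The file keeps its size of 14 bytes, so every other byte stays at the position the program expects.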
you can rewrite the file
that is how they broke protections in games before (cfr Luis stories), by just modifying the executable
Other software that does something similar:
http://cloud.radare.org/enyo/ how does this work?
in these instructions logic of where it jumps to, it will show hierarchy of the commands
https://camo.githubusercontent.com/999272d7fc19980bea5566b916ab155796a1aa9f/687474703a2f2f686162726173746f726167652e6f72672f67657470726f2f686162722f706f73745f696d616765732f6438392f3635302f6236622f64383936353062366265303537363662336232396263663633363665356536322e706e67
VISUALISING A FILE WHEN IT IS NOT OPERATING USING TOOL BINVIS
http://etherbox.local/var/www/html/binvis.io/#/
http://binvis.io/#/
Same bytes can be opened as image/sound file...
Why looking into files?
Wanted to look into the executables of the computer, and use visualizing tools to see readable text
if you convert bytes to colour, you can immediately see where is text
you can also recognize patterns that repeat
f.ex. lighter color for higher entropy (less structure): will show you when something is encrypted or not, when it is encrypted it is a mess
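Entropy can also be put into a number. The sketch below computes the Shannon entropy of a file in bits per byte with `od` and `awk`; the function name `byte_entropy` is invented here, and whether binvis uses exactly this measure is an assumption.

```shell
# Hypothetical helper: Shannon entropy of a file in bits per byte
# (0 = a single repeated byte value, 8 = all byte values equally likely).
byte_entropy() {
    od -An -v -tu1 "$1" | awk '
        { for (i = 1; i <= NF; i++) { count[$i]++; total++ } }
        END {
            if (total == 0) { print "0.000"; exit }
            h = 0
            for (b in count) {
                p = count[b] / total
                h -= p * log(p) / log(2)
            }
            printf "%.3f\n", h
        }'
}

workdir=$(mktemp -d)
printf 'aaaaaaaa' > "$workdir/flat"     # one value repeated
printf 'abcdabcd' > "$workdir/mixed"    # four values, equally often
byte_entropy "$workdir/flat"     # 0.000
byte_entropy "$workdir/mixed"    # 2.000
```

Encrypted or compressed data tends to come out close to 8, plain text noticeably lower.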
when you say 'I wrote a program', it is a file
binvis is a way to visualise that file
[analogy with the punchcard sorter]?
or, it has to do with the difference between movement/static process/file?
(observing something static vs 'in movement')
analogy with EEG - brain monitoring electric impulse
difference between where the file is located on harddisk and where it is loaded
we could look for file in memory (RAM)
## Another nice little tool
$ strings ls
shows only the readable parts of the ls command
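What `strings` does can be approximated with `tr` and `grep`. This is a rough stand-in, not how `strings` is implemented; the function name `printable_runs` and the minimum length of 4 characters are choices made for the sketch.

```shell
# Rough stand-in for strings: turn every non-printable byte into a newline,
# then keep only runs of at least 4 printable characters.
printable_runs() {
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | LC_ALL=C grep -E '.{4,}'
}

workdir=$(mktemp -d)
# Sample "binary": short and long text separated by non-printable bytes.
printf 'abc\001\002hello world\003xy\n' > "$workdir/blob"
printable_runs "$workdir/blob"    # hello world
```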
/dev/mem is file that is representing the memory in action, treated as a file
$ strings /dev/mem
$ strings /dev/mem | grep a
## WISHES FOR NEXT DAYS
- explore /dev/mem: load our own executable, find it back, see what the results are
- make a big drawing!!!
- re-look on the second part on hard drive access, look at inodes, debugfs, hard drive physical locations..
- explore /dev/fb0 : (or framebuffer) use the 'graphic memory' to directly access the pixels on the screen
- same with /dev/dsp or /dev/sound, so observing files through their sonification?
- create bash script that outputs harddrive location of a file (inode, blockgroup, blocks...)
- autopsy of a virtual machine
- a way back to the mechanical/study of software and l
FRIDAY
files, programs-->where is the operational level? how to get closer to the workings, the instruction-being of an executable/executing file
MEMORY
In the early days, the easiest way was to dump the memory from the memory device (/dev/mem), but over time access was more and more restricted in order to keep malicious processes from directly accessing the kernel memory. The kernel option CONFIG_STRICT_DEVMEM was introduced in kernel version 2.6 and upper (2.6.36–2.6.39, 3.0–3.8, 3.8+HEAD). So you'll need to use a Linux kernel module in order to acquire memory.
Lime (http://code.google.com/p/lime-forensics/) is an alternative solution to acquire memory from Linux. Lime supports more recent versions of the Linux kernel. As the technique to expose and acquire memory is less intrusive, the forensic acquisition might be more accurate.
Exploring boundaries/tensions
databases treat their content as data (database punctualization)
some exploits manage to include operations in a database
tools/knowledge/power relation--->using some tools-getting an understanding of how things work immediately gives you this feeling of power over...in this case to mess things up
///////
look at QUINES software
aquine
a quine
a/quine
why the theological framework
because we have been stuck and talking suspended between many dualisms
such as: hardware/software, code/data, material/immaterial, giraffe/zebra
but, they are all based on a dualist framework lasting millennia
the quine is interesting because it's a piece of code that outputs itself, therefore self reflexive
but, it also exposes the problem of dualism
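For reference, a quine can be very short. The one below is a common shell construction, not something written during the session: the string `s` is used twice, once as the template for `printf` and once as the data filled into it. The sketch writes it to a file, runs it and compares the output with the source.

```shell
workdir=$(mktemp -d)

# Write a one-line quine to a file. The quoted heredoc keeps it literal.
cat > "$workdir/quine.sh" <<'EOF'
s='s=\47%s\47;printf "$s" "$s"\n';printf "$s" "$s"
EOF

# Run it and compare its output with its own source.
sh "$workdir/quine.sh" > "$workdir/quine.out"
cmp "$workdir/quine.sh" "$workdir/quine.out" && echo "output is identical to source"
```

`\47` is the octal code for the single quote, which lets the program print its own quoting without breaking it.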
why analogy? because we can understand/expose software either univocally or equivocally
so, we must find a poetic-way-of-analysis
like: a worm(soft) that goes through like a worm(butterfly)
the worm, in the end, is a bug (to find the worm you debug)
or: run a software in debug mode so that you see the function, where it is (Gottfried)
poetic: as a certain elogism as used by the russian constructivists
that is to oppose something not connected together, a sort of serendipity that binds things together
so: a software that observes software must create moments that bring us to a new meaning
or: we can observe the debug argument itself; is it just printing out? how about the inputs?
-bug mode?
another element that could create this tension:
the idea of software as sequential instruction and the wave of parallel computing
so, can we actually create parallellism into the sequential construction?
could git be considered a tool that has a parallelist structure?
but, git is maybe just an analogy of parallel computing
but could we have this tension with multiple parallel processes running from computers connected to the same server?
could that be our serendipity?
we need to develop an entirely new language, that you can't name.
programming language is an instruction set: it's like thinking like a machine
so why is it called a language at all? so the computer gets anthropomorphized and the human gets machinized!
back to ram
/dev/mem
tools to explore processes stored in the memory
ps ax | grep process
cd /proc/numberoftheprocess
cat maps
--> check what it is using
The proc filesystem is a pseudo-filesystem which provides an
interface to kernel data structures. It is commonly mounted at
/proc. Most of it is read-only, but some files allow kernel
variables to be changed.
dump to a file-->change something in the file-->dump new to a file-->diff oldfile newfile
"where am i?"
to find read/write memory addresses of a certain process
awk -F "-| " '$3 ~ /rw/ { print $1 " " $2}' /proc/PID/maps
take the range and drop it to hexdump
sudo dd if=/dev/mem bs=1 skip=$(( 16#b7526000 - 1 )) \
count=$(( 16#b7528000 - 16#b7526000 + 1)) | hexdump -C
something else to try
https://github.com/anacrolix/archive/blob/master/tibia/butox/memutil.py
convert hex to decimal
echo $((0xb7526000))
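The awk filter and the hex arithmetic can be tried on a sample line before pointing them at a real process. In this sketch the sample line is made up in the format of /proc/PID/maps, using the addresses from the dd example. Two things to keep in mind: the end address in maps is exclusive, and the addresses are virtual addresses of that one process, so they index /proc/PID/mem rather than physical memory in /dev/mem.

```shell
# Made-up sample line in the format of /proc/PID/maps.
sample='b7526000-b7528000 rw-p 00000000 00:00 0'

# Same awk idea as above: split on "-" or space, keep read/write regions.
range=$(echo "$sample" | awk -F '-| ' '$3 ~ /rw/ { print $1 " " $2 }')
start_hex=${range% *}
end_hex=${range#* }

# Shell arithmetic understands the 0x prefix, so no separate conversion step.
start=$(( 0x$start_hex ))
length=$(( 0x$end_hex - 0x$start_hex ))

echo "start: $start bytes, length: $length bytes"
```

For the sample line this gives a start of 3075629056 and a length of 8192 bytes, i.e. two pages of 4096 bytes.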
serendipity and paranoia!
devmem
fmem module
http://www.forensicswiki.org/wiki/Linux_Memory_Analysis
fmem 1.5.0
This module creates /dev/fmem device,
that can be used for dumping physical memory,
without limits of /dev/mem (1MB/1GB, depending on distribution)
https://www.gnu.org/software/gdb/
dump to processes directly? inject to them?
pan/monopsychism:
(aquinas famously opposed averroes..whose philosophy can be interpreted as monopsychist)
shared memory
copying the same memory to different computers
https://en.wikipedia.org/wiki/Reflection_%28computer_programming%29
it could cut through the memory like a worm
or it could go through the memory of different computers one after the other and take and leave something there
what it does should be something like self-display, performative argument, aesthetics, like quines
but somehow in relation to its relative context (e.g. where it is is also defined by its neighbours etc)
---------------------------------------------------------------------------------------
so software exists only outside your computer? only in general terms?
checking for the word software in all man pages:
grep -nr software /usr/local/man
!!!!
software appears only in terms of license:
This program is free software
This software is copyright (c)
we don't run software..we still run programs
nevertheless software is everywhere
----------------------------------
SATURDAY
https://linux.die.net/man/1/strace
traces/maps calls and interactions between processes and operating system
"strace runs the specified command until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process."
popular system calls are open, read, write, close, wait, exec, fork, exit, and kill
e.g.
strace echo "strace blablabla" >> aquine
it looks something like this:
[.....]stuff
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1607712, ...}) = 0
mmap(NULL, 1607712, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fa80d400000
close(3) = 0
fstat(1, {st_mode=S_IFREG|0644, st_size=2884, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa80d5ae000
write(1, "strace blablabla\n", 17) = 17
close(1) = 0
munmap(0x7fa80d5ae000, 4096) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++
For the manual page: http://etherbox.local:9001/p/auqinas
------------------------------
LIST OF TOOLS
lsof
list open files
An open file may be a regular file, a directory, a block special file, a character special file, an executing text reference, a library, a stream or a network file (Internet socket, NFS file or UNIX domain socket.) A specific file or all the files in a file system may be selected by path.
but everything is a file in unix!
strings
shows the human readable/printable characters in a file
e.g.
$ strings /usr/bin/strings
(compare with "hexcurse strings")
strace
traces/maps calls and interactions between processes and operating system
"strace runs the specified command until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process."
popular system calls are open, read, write, close, wait, exec, fork, exit, and kill
e.g.
$ strace strace
nice
slows down the execution of a process (runs it with lower priority)
e.g.
$ nice command
------------------------------
EXERCISES:
framebuffer
Exploring /dev/fb0 (framebuffer)
- use the 'graphic memory' to directly access the pixels on the screen
- same with /dev/dsp or /dev/sound, so observing files through their sonification?
On the framebuffer: https://www.kernel.org/doc/Documentation/fb/framebuffer.txt
Your problem is a file!
For the Intake: Do you have software problems: is it a file? // can you write it into a file???
Possible problems:
forgetting password
(dis)organising files
psychological effects of messy desktops / computer problems
How do we deal with it
- ritualize the process (urban 'magician'/medium practices)
- take inspiration from colour therapy / sound therapy/mindless mindfulness? Agile Yoga?
Treatments:
>complex pixel surgery: identifying the pixel problem, extirpating it, restoring&(eventually returning the file to the owner)
>sound therapy
>regression therapy: renarrating traumatic software experiences through pixel rearrangement
>blood letting: Put everything into a file and delete it (Think about problems recovery system>what happens after the deletion?)
>Send to desktop logistics/cleaning service
Why is it a critique?
- Unix philosophy everything is a file
- challenging the idea of fixing (cause/effect)
- making software problems more tangible through transformation