File Observing exercise

## How to inspect an executable program?  
For example the _ls_ program on UNIX systems  
(some of the commands on this page need to be checked)

```bash
# copy ls as ls_command
$ cp /bin/ls ls_command  
# this will do the same as ls
$ ./ls_command  
```

`$ objdump -d ls_command`

(an object file: compiled code that can be "linked", to run inside another program)  
`objdump -d` shows the assembly/assembler code, which is the lowest level of human-readable code: <https://en.wikipedia.org/wiki/Assembly_language>  
on the left hexadecimal code / on the right human 'readable' language  
1st column: position in hexadecimal, the address  
2nd column: byte code, the raw bytes of the file  


try: 

```bash
$ hexcurse ls_command
# to display in binary (xxd -b prints the bits; hexcurse itself shows hexadecimal)
$ xxd -b ls_command
```

this shows the file as binary 0's and 1's  
a translation from binary into hexadecimal and ASCII  
the way we look at the content of files is based on groups of 8 bits (bytes)  

1st column: address  
2nd column: the bytes in binary, 8 bits each  
3rd column: the ASCII translation of those bytes into letters  
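
A minimal sketch of the same address / bytes / characters view, assuming only `od` (part of coreutils) in case hexcurse or xxd is not installed. The scratch file `tiny.bin` and the variable names are made up for this illustration.

```bash
# Write three known bytes ('H', 'i', newline) to a scratch file.
workdir=$(mktemp -d)
printf 'Hi\n' > "$workdir/tiny.bin"

# -A x: print the address column in hexadecimal
# -t x1: print every byte as a two-digit hexadecimal number
hex_view=$(od -A x -t x1 "$workdir/tiny.bin")

# -c: print the same bytes as characters (the newline shows up as \n)
char_view=$(od -A x -c "$workdir/tiny.bin")

printf '%s\n' "$hex_view" "$char_view"
```

The bytes 48 69 0a in the first view are the same content as H i \n in the second; only the translation differs.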

Intel 80x86 Assembly Language
<http://www.mathemainzel.info/files/x86asmref.html>

OpCodes

different representations of programs, ascii and hex

Finding military forensic tools for observing files.

![binvis.io on the file /bin/ls]( http://etherbox.local/home/pi/images/Screenshot_from_2017-06-07_164407.png )

<http://binvis.io>  
<http://etherbox.local/var/www/html/binvis.io/>  
used for studying viruses; the tool was used to find the ransomware; 15 years ago it was used to crack codes, by finding the 'jump'  
bytes are translated to shades of grey/colours instead of letters: 00000000 is black, 11111111 is white, with the greys in between  
There is always a jump in the program at the point where the copyright/license should be inserted; if you replace the 'jump' by 
90 (hexadecimal, the x86 'no operation' instruction) you can play!  

open your file in Gimp as raw data  

![]( http://etherbox.local/home/pi/images/GIMPOpenFileAsImageData.png )

(It's necessary to select "All files" (default is 'all images') in drop down filter and then select "Raw Image Data" as the File Type)
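
A rough sketch of the bytes-to-grey translation without GIMP or binvis: put a small PGM header in front of the raw bytes, so that any image viewer reads each byte as one grey pixel (0 = black, 255 = white). PGM is a swapped-in format that was not used in the session, and the file names, the width of 64 pixels and the random stand-in data are assumptions for this illustration.

```bash
workdir=$(mktemp -d)
src="$workdir/sample.bin"
# Stand-in for the file to observe (for example a copy of /bin/ls): 4096 random bytes.
head -c 4096 /dev/urandom > "$src"

width=64                              # pixels per row, a free choice
size=$(( $(wc -c < "$src") ))         # file size in bytes
height=$(( size / width ))            # complete rows only

# PGM header: magic number P5, width and height, highest grey value (255 = white).
{
  printf 'P5\n%d %d\n255\n' "$width" "$height"
  head -c $(( width * height )) "$src"
} > "$workdir/sample.pgm"
echo "wrote $workdir/sample.pgm ($width x $height pixels)"
```

Changing `width` changes how the rows wrap, just like the width field in GIMP's raw import dialog.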

What is the relationship between observation and translation?
- translations here from binary to ? to color of pixel/frequency/ascii

man ascii 
'every letter' of the alphabet can be represented by a number
![src:  https://en.wikipedia.org/wiki/ASCII#/media/File:US-ASCII_code_chart.png ](http://etherbox.local/home/pi/images/US-ASCII_code_chart.png )
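
A small sketch of the letter/number convention, using only the shell's `printf`; the variable names are made up for this illustration.

```bash
# Character -> number: a leading quote makes printf print the character's code.
code_of_A=$(printf '%d' "'A")
# The same number written in hexadecimal.
hex_of_A=$(printf '%x' "'A")
# Number -> character: printf takes the byte value as an octal escape (101 octal = 65 decimal).
char_65=$(printf '\101')
echo "A = $code_of_A (decimal) = $hex_of_A (hex); 65 -> $char_65"
```

This prints `A = 65 (decimal) = 41 (hex); 65 -> A`, the same row as in the chart above.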

## So ... where is the software?  

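for this we can use the command 'dd'  
/dev/mem represents your computer's physical memory as a file  
dd has an input (`if=`) and an output (`of=`), each of which can be a file or a device such as a hard drive or memory. `count` is the number of blocks to copy and `skip` the number of input blocks to jump over first; a block is `bs` bytes, 512 by default.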

```bash
$ dd if=/dev/mem of=memory.bin count=10000 skip=132143
```

another exercise: to copy a section of the physical or virtual memory

physical:
```bash
$ sudo dd if=/dev/mem of=memfile.dump bs=1 skip=_startingbyte_ count=_totalbytes_
```
for example, for 1 MB of RAM:
```bash
$ sudo dd if=/dev/mem of=memfile.dump bs=1 skip=23500 count=1000000
# virtual: same as above, but with if=/proc/kcore
```

to cut files to a specific byte size:
```bash
$ dd if=mem1000000.dump of=smaller.raw bs=1 count=3000
```
will cut the file after the first 3000 bytes.
```bash
$ dd if=mem1000000.dump of=smaller.raw bs=1 skip=4000 count=3000
```
same, but starting after the first 4000 bytes
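
To try `skip` and `count` without root rights or a memory dump, here is a small sketch on a 26-byte stand-in file; the file names are made up for this illustration.

```bash
workdir=$(mktemp -d)
# A 26-byte stand-in for a memory dump: the lowercase alphabet, no newline.
printf 'abcdefghijklmnopqrstuvwxyz' > "$workdir/alphabet.raw"

# Keep only the first 5 bytes.
dd if="$workdir/alphabet.raw" of="$workdir/first.raw" bs=1 count=5 2>/dev/null
# Jump over the first 10 bytes, then copy the next 5.
dd if="$workdir/alphabet.raw" of="$workdir/middle.raw" bs=1 skip=10 count=5 2>/dev/null

first=$(cat "$workdir/first.raw")
middle=$(cat "$workdir/middle.raw")
echo "first: $first  middle: $middle"
```

With `bs=1` both numbers count single bytes, so the second file holds `klmno`.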

"if you can cat, you can dd" 
```bash
$ sudo debugfs /dev/sda3
```
"in Linux you can't read the kernel memory addresses, they are protected" while the kernel runs  

using binvis.io to visualise the memory  
you can see unused areas, where the text is ...  

files are not continuous streams of bytes, but blocks  

## How does a file sit on the hard drive?

### Inode
An inode is a data structure identified by a unique number; it records where the file's data sits on the hard drive: <https://en.wikipedia.org/wiki/Inode>  
software that links inodes back to files on the hard drive: debugfs  
used for recovering data & forensics  
See also: <http://teaching.idallen.com/cst8207/13w/notes/450_file_system.html>  
<https://en.wikipedia.org/wiki/Inode_pointer_structure>  


List the inodes of a folder with `ls -i /`, or of a single file with `ls -i name_of_file`
```bash
$ ls -i ls_command
```
output example: 8443114 ls_command  

or:  
 ```bash
$ stat ls_command
 ```
example output:   
 ```bash
  File: 'ls_command'
  Size: 126584            Blocks: 248        IO Block: 4096   regular file
Device: 805h/2053d        Inode: 8443114     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: ( 1000/     ana)   Gid: ( 1000/     ana)
Access: 2017-06-10 09:12:16.839835082 +0200
Modify: 2017-06-07 16:08:10.298522614 +0200
Change: 2017-06-07 16:08:10.298522614 +0200
 Birth: -
  ```
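
A small sketch to see that the inode number belongs to the data and not to the name: a hard link is a second name for the same inode, while a copy gets a new one. The file names are made up for this illustration.

```bash
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/original.txt"
# A hard link is a second name for the same inode.
ln "$workdir/original.txt" "$workdir/second_name.txt"
# A copy is a new file and therefore gets a new inode.
cp "$workdir/original.txt" "$workdir/copy.txt"

# ls -i prints "<inode> <name>"; awk keeps only the inode number.
inode_original=$(ls -i "$workdir/original.txt" | awk '{print $1}')
inode_link=$(ls -i "$workdir/second_name.txt" | awk '{print $1}')
inode_copy=$(ls -i "$workdir/copy.txt" | awk '{print $1}')
echo "original=$inode_original link=$inode_link copy=$inode_copy"
```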

This is different to permissions of files:  
 ```bash
$ ls -l name_of_file
#example output:  
    -rwxr-xr-x 1 ana ana 126584 Jun  7 16:08 ls_command  


# See the number of free inodes on a filesystem
# The Unix command "df" will display each partition, and give a percentage of the disk blocks that are used.   
$ df -i

# example output
Filesystem       Inodes   IUsed    IFree IUse% Mounted on
udev            2010563     556  2010007    1% /dev
tmpfs           2015628     801  2014827    1% /run
/dev/sda5      12992512 1049361 11943151    9% /
tmpfs           2015628      11  2015617    1% /dev/shm
tmpfs           2015628       5  2015623    1% /run/lock
tmpfs           2015628      16  2015612    1% /sys/fs/cgroup
tmpfs           2015628      33  2015595    1% /run/user/1000  

# Have a look at your file system (divisions harddisk, types etc)
$ df -T
# output example:
Filesystem     Type     1K-blocks      Used Available Use% Mounted on
udev           devtmpfs   8042252         0   8042252   0% /dev
tmpfs          tmpfs      1612504      9676   1602828   1% /run
/dev/sda5      ext4     204416808 156790276  37219672  81% /
tmpfs          tmpfs      8062512       244   8062268   1% /dev/shm
tmpfs          tmpfs         5120         4      5116   1% /run/lock
tmpfs          tmpfs      8062512         0   8062512   0% /sys/fs/cgroup
tmpfs          tmpfs      1612504        72   1612432   1% /run/user/1000  


# Start the forensic/debugging program (the partition may differ, e.g. /dev/sda3):
$ sudo debugfs /dev/sda5
# then, at the debugfs prompt (keep the angle brackets around the inode number):
debugfs:  stat <inode_of_file>
```
this gives the block in which the file is saved on your hard drive (fill it in the skip -2)

find your file system (types etc)
```bash
$ df -T
$ sudo debugfs /dev/sda3
debugfs:  stat <8443114>

# Example output:  
Inode: 8443114   Type: regular    Mode:  0755   Flags: 0x80000
Generation: 1834475523    Version: 0x00000000:00000001
User:  1000   Group:  1000   Size: 126584
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 248
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x593808ca:472c5fd8 -- Wed Jun  7 16:08:10 2017
 atime: 0x593b9bd0:c83b7728 -- Sat Jun 10 09:12:16 2017
 mtime: 0x593808ca:472c5fd8 -- Wed Jun  7 16:08:10 2017
crtime: 0x593808ca:472c5fd8 -- Wed Jun  7 16:08:10 2017
Size of extra inode fields: 32
EXTENTS:
(0-30):34304529-34304559  


# or shorter:
$ sudo debugfs -R "stat <inode_number>" /dev/sda5
```


this is making me think about the different kinds of/definitions of abstraction
```bash
$ sudo dumpe2fs -h /dev/sda3   # to find out the block size
$ sudo dd if=/dev/sda3 of=newfile.txt bs=1 count=10 skip=209161134080
```
skip: how far dd needs to jump in order to get to our file. With bs=1 it counts bytes, so it is the block number given by debugfs multiplied by the block size.
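
The arithmetic behind that skip value can be sketched in the shell itself. The block size of 4096 is an assumption (it matches the "IO Block" in the stat output above, but dumpe2fs is the place to check it); the block number is the first one in the EXTENTS line of the debugfs example.

```bash
block_size=4096          # assumption: the value reported by dumpe2fs -h
first_block=34304529     # first block in the EXTENTS line of the debugfs example
byte_offset=$(( first_block * block_size ))
echo "block $first_block starts at byte $byte_offset"

# Going the other way: which block does the skip value of the dd command point at?
skip_bytes=209161134080
block_of_skip=$(( skip_bytes / block_size ))
echo "byte $skip_bytes is the start of block $block_of_skip"
```

Instead of `bs=1` with a huge skip, dd could also be given `bs=4096` and the block number itself as skip, which reads a whole block per step.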

Goal: how does a file sit on a hard drive?  
when you remove a file, you only remove the reference to its inode; the data sits there until it is overwritten. To really erase it you have to 'zero' it; the command 'shred' overwrites the file so that it cannot be found back  
this happens because it takes less energy to not overwrite it with zeros  

defragmenting: reorganises your hard disk so that empty spaces are brought together, and it takes less time to go over the index  

organization/configuration and reconfiguration of the arrangement of a given material's potential states/state potentials   

## Back to the question we started with
The difference between a file and software?  
The difference between non-executable and executable files  
Perhaps first step is permission (-x)  
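
A sketch of that first step: the same bytes, first without and then with the x permission. The file name `note` and its content are made up for this illustration; running the file at the end assumes the temporary directory allows execution.

```bash
workdir=$(mktemp -d)
# The same bytes throughout: a two-line shell script.
printf '#!/bin/sh\necho "I am running"\n' > "$workdir/note"

# Without the x permission the file is only data: it can be read, not run.
chmod a-x "$workdir/note"
if [ -x "$workdir/note" ]; then before="executable"; else before="not executable"; fi

# Setting the x bit changes nothing in the content, only what the system is willing to do with it.
chmod u+x "$workdir/note"
if [ -x "$workdir/note" ]; then after="executable"; else after="not executable"; fi

output=$("$workdir/note")
echo "before: $before / after: $after / output: $output"
```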

COM vs EXE files. The EXE has a header with metadata about its contents, while a COM is "headerless" and thus more open to modification, as the system will still attempt to execute it.   


Any binary can be run. So when a file is corrupt that means that it cannot be rendered/read/made visible because something in the operating system's reading tells it not to finish/present  

In essence there is no difference  
Why is it necessary to make the distinction?  

_everything is a file (unix)_  
so: no difference between file and program and data  
does it matter if something is a letter, character, ...  
it has consequences for who/what can read  

if it is executable, is it software?  

if it is noise, it is still 'executed'?  
It depends if it is gibberish or something that the computer can understand as a command  

anything that can run is software?  
(what does it mean to 'run')  

storing software on a tape ... you can play it as an audiofile?  


an audio file is fundamentally different from python, 
it has some tolerance, you can play with it filter it  
and python is different  
ascii also not  
you cannot interpret a perl script as a python code  
you can frame data in different ways  
-> we're touching the border of software here? is it not more about 'tolerance' and 'noise'?  
software is language. Ascii is not language  
data vs execution of data  
ASCII is not language, it is a convention to transform letters into bits  
There is a difference between data and an executing of data, they have their own realities  

a thought experiment:
* can we write software on a cassette tape in such a way that it can play music?
* I take your python script, put it on a cassette and play it
* is it still a program now that it is stored and is just sound, or is it not a script anymore?

encoding/decoding and tolerance ... translation! (can you get it back to a form that it can 'run')  
but you can   

the permission of natural language vs interrupting a software  

a difference between conversion/translation and execution  
(Thinking the machine that translates a punchcard into printed form... is it "executing" it ... well not exactly, it's just translating from punchcard form to printed symbol form)  

we are dissecting file systems  
i would suggest writing a file system that has the side effect of being playable  
i don't see a difference between the two  
we have no way to put a hard drive into a player  
but with the cassette tapes that works  

### A story on conversions
Pirate Broadcast: making money out of cracking games. The code would be broadcast over FM, and at 12:00 everyone would record the code on their tapes. This tape could then be put in the Commodore. Is this data? It's software-as-a-service! 
What kind of games? Commodore 64 games  

Layers of abstraction, where each have different tolerances / expectations / NOISE  


Independence day of software  
IBM bundled their hardware and software  
then there came a birth of software, a legal rule that says you cannot bundle the two anymore  

multiple births of software: <http://etherbox.local/home/pi/books/9783957960566-No-Software-just-Services.pdf#page=21>

use of data as a way to optimize a service, cf. the example of offering a login service to websites: register the times of users' logins, scan for anomalies, resell a service that raises an alarm when a user signs in at an irregular moment

"Software as configuration"

turing already said: data = programme
so why try to separate?
In Big Data software + data are inseparable.

abstraction ... different usages of that term today
abstracting away from its political / economical reality

Suggestion to create more a field of values, avoid a duality (data/program) ... When did the line start to get blurry?

## wrap up
This exercise has put the focus on the location of software, where it sits on the hard drive, instead of focusing on what it does

It would be a good idea to sketch lines of the terminologies, and see where these moments appear.

NEXT DAY 8-6 REHEARSAL
how to all share terminal

1. in the observed terminal, just type
$ screen

2. for the observers:
# log into a machine
$ ssh observatory@10.9.8.225
pass: techno
$ screen -rx
$ echo hello

START
# /dev/zero: a file that generates a constant stream of zeros (for a random set of numbers, see /dev/urandom)
# xxd: a command that takes standard input or a file and outputs it in hexadecimal format
$ xxd /dev/zero

https://linux.die.net/man/1/xxd
xxd makes a hex/binary dump of a file (we used it yesterday to look at files)
with the -b option it shows binary, else hexadecimal
/dev/zero is a virtual device that generates an endless stream of zero bytes (nothing random about it)
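
A sketch that stops by itself, assuming `head` and `od` from coreutils in place of xxd: it takes 16 bytes from /dev/zero and, for comparison, 16 bytes from /dev/urandom. The variable names are made up for this illustration.

```bash
# Take only the first 16 bytes; reading /dev/zero without a limit never ends.
zeros=$(head -c 16 /dev/zero | od -A n -t x1)
# For comparison, 16 bytes from the random device.
random=$(head -c 16 /dev/urandom | od -A n -t x1)
echo "zero:   $zeros"
echo "random: $random"
```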

Presentation of Anita & Martino 

The problem of how you observe files that are also processes
Different ways to observe the binary:  
    there are two meanings of binary!
a file in/as binary (1): as "raw data" (0 & 1), also shown as hexadecimal
or a file as a binary (2): an executable file
executable files give instructions directly to the processor

See the content of a file as hex/binary:
    $ hexcurse filename # (for example hexcurse /bin/ls )
    $ xxd filename #(shows in hexadecimal, for example xxd /bin/ls)
    $ xxd -b filename #(shows in binary)

Example ls command
# copy ls file
$ cp /bin/ls ls_command
$ hexcurse ls_command

See executable binaries / commands disassembled (i.e. human-readable, e.g. mov/call/compare-cmp/subtract-sub):
objdump disassembles an object file into human-readable instructions, which are (almost) a 1:1 translation of the machine code
$ objdump -d ls

# e.g. esp is a register: a place inside the processor that stores an address in memory
# 1st column: the address at which the instruction starts
# an operation is split into different commands
which commands are executed in the end depends on the architecture of the processor.

Overwriting the binary (hexcurse allows you to write into the binary)

# look for text inside ls
$ ls --help | grep GNU

then search for it in the file opened with hexcurse (Ctrl+F GNU) --> replace with martino :P --> make the file executable --> run --> Segmentation fault (core dumped)

you can rewrite the file
that is how they broke protections in games before (cf. Luis' stories), by just modifying the executable
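
One possible reason for the segmentation fault (an assumption, not checked during the session): 'martino' is longer than 'GNU', so the extra letters overwrite the bytes that followed. A sketch of a same-length replacement, done with grep and dd on a made-up stand-in file instead of a real executable; `fake_binary` and the replacement `XYZ` are hypothetical.

```bash
workdir=$(mktemp -d)
# Stand-in for an executable: a readable string between other bytes, separated by NUL bytes.
printf 'HEADER\000GNU coreutils\000FOOTER' > "$workdir/fake_binary"

# Byte offset of the string: -b prints "offset:match", -o only the match, -a treats binary as text.
offset=$(grep -a -b -o 'GNU' "$workdir/fake_binary" | head -n 1 | cut -d: -f1)

# Overwrite exactly 3 bytes in place; conv=notrunc keeps the rest of the file untouched.
printf 'XYZ' | dd of="$workdir/fake_binary" bs=1 seek="$offset" conv=notrunc 2>/dev/null

patched=$(tr '\000' ' ' < "$workdir/fake_binary")
size=$(( $(wc -c < "$workdir/fake_binary") ))
echo "offset=$offset size=$size content=$patched"
```

The file keeps its size of 27 bytes and everything after the replaced string stays at the same address.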

Other software that does something similar:
http://cloud.radare.org/enyo/  how does this work?
in these instructions logic of where it jumps to, it will show hierarchy of the commands
https://camo.githubusercontent.com/999272d7fc19980bea5566b916ab155796a1aa9f/687474703a2f2f686162726173746f726167652e6f72672f67657470726f2f686162722f706f73745f696d616765732f6438392f3635302f6236622f64383936353062366265303537363662336232396263663633363665356536322e706e67

VISUALISING A FILE WHEN IT IS NOT OPERATING USING TOOL BINVIS
http://etherbox.local/var/www/html/binvis.io/#/
http://binvis.io/#/
Same bytes can be opened as image/sound file...
Why look into files?
Wanted to look into the executables of the computer, and use visualizing tools to see readable text
if you convert bytes to colour, you can immediately see where is text
you can also recognize patterns that repeat
f.ex. lighter color for higher entropy (less structure): will show you when something is encrypted or not, when it is encrypted it is a mess

when you say 'I wrote a program', it is a file
binvis is a way to visualise that file

[analogy with the punchcard sorter]?
or, it has to do with the difference between movement/static process/file?
(observing something static vs 'in movement')
analogy with EEG - brain monitoring electric impulse

difference between where the file is located on harddisk and where it is loaded
we could look for file in memory (RAM)

## Another nice little tool
$ strings ls
shows only the readable (printable) parts of the ls command
/dev/mem is a file that represents the memory in action, treated as a file
$ strings /dev/mem
$ strings /dev/mem | grep a
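
A rough approximation of what `strings` does, built from `tr` and `grep` so it also works where binutils is not installed: turn every non-printable byte into a line break and keep the runs of at least 4 printable characters. The real tool has more options; the file `mixed.bin` is made up for this illustration.

```bash
workdir=$(mktemp -d)
# Readable text surrounded by non-printable bytes (written as octal escapes).
printf '\001\002readable part\000\003ab\000another string\377' > "$workdir/mixed.bin"

# tr -c: replace every byte that is NOT printable by a newline
# grep -E '.{4,}': keep only the lines of 4 characters or more
found=$(LC_ALL=C tr -c '[:print:]' '\n' < "$workdir/mixed.bin" | grep -E '.{4,}')
printf '%s\n' "$found"
```

The two-letter run `ab` is dropped because it is shorter than 4 characters.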



## WISHES FOR NEXT DAYS
- explore  /dev/mem: load our own executable, find it back, see what the results are
- make a big drawing!!!
- take another look at the second part on hard drive access: inodes, debugfs, hard drive physical locations..
- explore /dev/fb0 : (or framebuffer) use the 'graphic memory' to directly access the pixels on the screen
- same with /dev/dsp or /dev/sound, so observing files through their sonification?
- create bash script that outputs harddrive location of a file (inode, blockgroup, blocks...)
- autopsy of a virtual machine

- a way back to the mechanical/study of software and l



FRIDAY

files, programs-->where is the operational level? how to get closer to the workings, the instruction-being of an executable/executing file

MEMORY

In the early days, the easiest way was to dump the memory from the memory device (/dev/mem), but over time access was more and more restricted in order to prevent malicious processes from accessing the kernel memory directly. The kernel option CONFIG_STRICT_DEVMEM was introduced in kernel version 2.6 and upper (2.6.36–2.6.39, 3.0–3.8, 3.8+HEAD). So you'll need to use a Linux kernel module in order to acquire memory.

LiME (http://code.google.com/p/lime-forensics/) is an alternative solution to acquire memory from Linux. LiME supports more recent versions of the Linux kernel. As the technique to expose and acquire memory is less intrusive, the forensic acquisition might be more accurate.




Exploring boundaries/tensions

databases treat their content as data (database punctualization)
some exploits manage to include operations in a database


tools/knowledge/power relation--->using some tools-getting an understanding of how things work immediately gives you this feeling of power over...in this case to mess things up


///////
look at QUINES software

aquine

a quine

a/quine


why the theological framework 
because we have been stuck, talking suspended between many dualisms
such as: hardware/software, code/data, material/immaterial, giraffe/zebra
but they are all based on a dualist framework that has lasted for millennia
the quine is interesting because it is a piece of code that outputs itself, and is therefore self-reflexive
but it also exposes the problem of dualism
why analogy? because we can understand/expose software either univocally or equivocally

so, we must find a poetic-way-of-analysis
like: a worm(soft) that goes through like a worm(butterfly)
the worm, in the end, is a bug (to find the worm you debug)
or: run a software in debug mode so that you see the function, where it is (Gottfried)
poetic: as a certain elogism as used by the russian constructivists
that is, to put in opposition things that are not connected, a sort of serendipity that binds things together
so: a software that observes software must create moments that bring us to a new meaning
or: we can observe the debug argument itself; is it just printing out? how about the inputs?

-bug mode?

another element that could create this tension:
the idea of software as sequential instruction and the wave of parallel computing
so, can we actually create parallelism in the sequential construction?
could git be considered a tool that has a parallelist structure?
but git is maybe just an analogy of parallel computing
but could we have this tension with multiple parallel processes running from computers connected to the same server?
could that be our serendipity?
we need to develop an entirely new language, one that you can't name. 
programming language is an instruction set: it's like thinking like a machine
so why is it called a language at all? so the computer gets anthropomorphized and the human gets machinized!

back to ram

/dev/mem
tools to explore processes stored in the memory

ps ax | grep process
cd /proc/numberoftheprocess
cat maps
--> check what it is using

The proc filesystem is a pseudo-filesystem which provides an interface to kernel data structures. It is commonly mounted at /proc. Most of it is read-only, but some files allow kernel variables to be changed.

dump to a file-->change something in the file-->dump new to a file-->diff oldfile newfile
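
A sketch of the compare step on two tiny stand-in dumps. `cmp -l` is swapped in for diff here because diff only reports that two binary files differ, while cmp lists every changed byte; the file names and contents are made up for this illustration.

```bash
workdir=$(mktemp -d)
# Two "dumps" of the same 15 bytes, before and after something changed.
printf 'state: idle....' > "$workdir/old.dump"
printf 'state: running.' > "$workdir/new.dump"

# cmp -l lists every differing byte: its position (decimal, counted from 1)
# and the two byte values in octal. cmp exits with status 1 when the files differ.
changes=$(cmp -l "$workdir/old.dump" "$workdir/new.dump" || true)
changed_bytes=$(( $(printf '%s\n' "$changes" | wc -l) ))
printf '%s\n' "$changes"
echo "$changed_bytes bytes changed"
```

The positions in the first column say where in the dump to look, which is one way to answer "where am i?".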

"where am i?"


to find read/write memory addresses of a certain process
awk -F "-| " '$3 ~ /rw/ { print $1 " " $2}' /proc/PID/maps
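
To see what that awk line keeps without picking a real process, here is a sketch that feeds it three lines in the format of /proc/PID/maps. The addresses and the library path are made up for this illustration.

```bash
# Three lines in the format of /proc/PID/maps (made-up addresses and paths).
sample_maps='b7526000-b7528000 rw-p 00000000 00:00 0
b7528000-b7529000 r-xp 00000000 08:05 1234 /lib/example.so
bfc00000-bfc21000 rw-p 00000000 00:00 0 [stack]'

# Split each line on "-" or " ": field 1 = start address, field 2 = end address,
# field 3 = the first part of the permissions (rw for readable and writable).
writable=$(printf '%s\n' "$sample_maps" | awk -F "-| " '$3 ~ /rw/ { print $1 " " $2 }')
printf '%s\n' "$writable"
```

Only the two `rw-p` regions come out; the `r-xp` line (code, read-only) is dropped.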

take the range and pipe it to hexdump (the end address in maps is exclusive, so the length is end minus start)
sudo dd if=/dev/mem bs=1 skip=$(( 16#b7526000 )) \
        count=$(( 16#b7528000 - 16#b7526000 )) | hexdump -C

        something else to try
https://github.com/anacrolix/archive/blob/master/tibia/butox/memutil.py

convert hex to decimal
echo $((0xb7526000))
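
The same conversion for a whole range, as a sketch: start, end and length in decimal, which are the numbers dd expects for skip and count. The variable names are made up; the addresses are the ones from the example above.

```bash
start_hex=b7526000
end_hex=b7528000
# The 0x prefix makes shell arithmetic read the number as hexadecimal.
start=$(( 0x$start_hex ))
end=$(( 0x$end_hex ))
length=$(( end - start ))
echo "start=$start end=$end length=$length bytes"
# and back from decimal to hexadecimal
length_hex=$(printf '%x' "$length")
echo "length in hex: $length_hex"
```

This gives start=3075629056 and a length of 8192 bytes (hexadecimal 2000), i.e. two blocks of 4096.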

serendipity and paranoia!


devmem
fmem module
http://www.forensicswiki.org/wiki/Linux_Memory_Analysis

fmem 1.5.0
This module creates /dev/fmem device,
that can be used for dumping physical memory,
without limits of /dev/mem (1MB/1GB, depending on distribution)

https://www.gnu.org/software/gdb/
dump to processes directly? inject to them?

pan/monopsychism:
(Aquinas famously opposed Averroes, whose philosophy can be interpreted as monopsychist)

shared memory

copying the same memory to different computers

https://en.wikipedia.org/wiki/Reflection_%28computer_programming%29


it could cut through the memory like a worm

or it could go through the memory of different computers one after the other and take and leave something there

what it does should be something like self-display, performative argument, aesthetics, like quines
but somehow in relation to its relative context (e.g. where it is is also defined by its neighbours etc)

---------------------------------------------------------------------------------------
so software exists only outside your computer? only in general terms?
checking for the word software in all man pages:

grep -nr software /usr/local/man
!!!!

software appears only in terms of license: 

This program is free software
This software is copyright (c)

we don't run software..we still run programs
nevertheless software is everywhere

----------------------------------
SATURDAY


https://linux.die.net/man/1/strace
traces/maps calls and interactions between processes and operating system
"strace runs the specified command until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process."
 popular system calls are open, read, write, close, wait, exec, fork, exit, and kill

 e.g.
 strace echo "strace blablabla" >> aquine
 it looks something like this:

 [.....]stuff
 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1607712, ...}) = 0
mmap(NULL, 1607712, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fa80d400000
close(3)                                = 0
fstat(1, {st_mode=S_IFREG|0644, st_size=2884, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa80d5ae000
write(1, "strace blablabla\n", 17)      = 17
close(1)                                = 0
munmap(0x7fa80d5ae000, 4096)            = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

For the manual page: http://etherbox.local:9001/p/auqinas

------------------------------
LIST OF TOOLS

lsof
list open files
An open file may be a regular file, a directory, a block special file, a character special file, an executing text reference, a library, a stream or a network file (Internet socket, NFS file or UNIX domain socket.) A specific file or all the files in a file system may be selected by path.
but everything is a file in unix!

strings 
shows the human readable/printable characters in a file
e.g.
$ strings /usr/bin/strings

(compare with "hexcurse strings")

strace
traces/maps calls and interactions between processes and operating system
"strace runs the specified command until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process."
 popular system calls are open, read, write, close, wait, exec, fork, exit, and kill
 e.g.
 $ strace strace

nice
slows down the execution of a process by running it at a lower priority
$ nice command

------------------------------
EXERCISES:

framebuffer
Exploring /dev/fb0 (framebuffer)
- use the 'graphic memory' to directly access the pixels on the screen
- same with /dev/dsp or /dev/sound, so observing files through their sonification?
On the framebuffer: https://www.kernel.org/doc/Documentation/fb/framebuffer.txt


Your problem is a file!
For the Intake: Do you have software problems: is it a file? // can you write it into a file???

Possible problems:
    forgetting password
    (dis)organising files
    psychological effects of messy desktops / computer problems

How do we deal with it
- ritualize the process (urban 'magician'/medium practices)
- take inspiration from colour therapy / sound therapy/mindless mindfulness? Agile Yoga?

Treatments:
>complex pixel surgery: identifying the pixel problem, extirpating it, restoring&(eventually returning the file to the owner)
>sound therapy
>regression therapy: renarrating traumatic software experiences through pixel rearrangement
>blood letting: Put everything into a file and delete it (Think about problems recovery system>what happens after the deletion?)
>Send to desktop logistics/cleaning service 

Why is it a critique?
- Unix philosophy everything is a file
- challenging the idea of fixing (cause/effect)
- making software problems more tangible through transformation