new
_new
is_valid
_run_preprocess
_build_exclusions
version
_set_timezone
_verify
get_timezone_list
start
get_out_footer
_parse_dir
_parse_file
_process_timestamp
_open_file
_close_file
_input_exists
_output_exists
get_timezone
_load_input_list
_load_input_module
_load_input
_load_output
_get_module_help
_format_sort
_calc_offset
Log2Timeline - The main engine of log2timeline and the API to interface with.
This is the main engine of the tool log2timeline. This file, or engine, serves as the communicator between the different parts of the tool. It is the API that the front-end talks to, and the engine that initiates both the input and output modules and controls the flow between them.
This is the bread and butter of log2timeline, so to speak: the library that can be imported into any tool that wishes to implement a front-end for it. This documentation should serve as a guideline for how to use the API. If the intention is to develop a new front-end or a tool that interacts with the engine, either consult this manual, the tool's wiki (https://code.google.com/p/log2timeline/) or examine the example front-end found inside the dev/ folder of the source tarball.
    use constant FALSE => 0;
    use constant TRUE  => 1;

    # create a new instance of log2timeline
    my $l = Log2Timeline->new(
        'file'       => '.',
        'recursive'  => FALSE,
        'input'      => 'all',
        'output'     => 'csv',
        'time_zone'  => 'local',
        'offset'     => FALSE,
        'exclusions' => '',
        'text'       => '',
        'debug'      => FALSE,
        'digest'     => FALSE,
        'quick'      => FALSE,
        'raw'        => FALSE,
        'hostname'   => '',
        'preprocess' => 0,
    );

    # check if there is a new version available
    print $l->check_upgrade;

    # get the current version number of the tool
    print $l->version;

    # get the help text from an input module
    print $l->get_help_in( 'recycler' );

    # get the help text from an output module
    print $l->get_help_out( 'csv' );

    # change some of the tools settings
    $l->set( 'recursive' => 'yes' );

    # get a list of all the available input modules and lists
    $l->get_inputs;

    # get a list of all the available output modules
    $l->get_outputs;

    # start parsing through the files, gathering timestamps
    $l->start;

    # get a list of all the available timezones
    $l->get_timezone_list;

    # get the currently set timezone
    $l->get_timezone;
This documentation contains a list and description of both public and private methods. All private methods start with an underscore character (_) and should not be used or called by front-ends or other tools interacting with the API.
All other methods (excluding private) are considered to be part of the API and can be used or called by various front-ends.
new
The constructor is a very simple one: it just returns the value of the secondary constructor. A new Log2Timeline object can be created without any parameters, in which case the tool simply uses the default value for each of them.
If, however, there is a need to override some of the default behavior of the tool (for instance, to parse something other than the current directory, or to use a time zone other than the local one of the machine), there are two options: either pass parameters to the constructor to define the options, or use the set sub routine to change the value of a parameter.
Since this constructor only calls a secondary one, please refer to the description of _new for a more detailed account of what is done in this phase.
_new
The constructor of the tool does not really do anything except to call the private constructor (this sub routine) and return the value that this sub routine returns.
This routine takes the hash value that is passed to the constructor as a parameter and sets up all the values of the needed variables in the tool. There are some values that need to be set for the tool to properly operate, such as the path to the file/directory that needs to be parsed, the name(s)/list of input modules to load up, time zone of the image and the output, etc.
This routine takes care of assigning values to each of these variables. It compares the hash that is sent to the routine as a parameter and checks if it recognizes the variable. If it does it will assign the value that is passed to it, otherwise it will assign it to the default value that is hardcoded into this sub routine. This means that no values need to be sent to the engine in order for it to work, it is only necessary to define those that need to be changed from the default values (listed below).
The variables that the sub routine recognizes are (default values inside brackets []):
file: The path to the directory/file to be parsed/examined.
recursive: Boolean value (0/1) that indicates whether the tool should recurse through a mount point/directory. [0/FALSE]
input: String containing a comma-separated list of all input modules that should be loaded. Each entry can be either the name of a module or the name of a list file, and it can be negated with a - sign, indicating that the module should be omitted from loading. [all]
output: String containing the name of the output module used for output. [csv]
time_zone: String containing the time zone of the image/file that needs to be parsed. The string can be any value that the DateTime library supports, with the addition of 'local' and 'list'. 'local' will use the local timezone of the computer the tool is run from and 'list' will make the engine build a list of all available time zones and print them out. [local]
String containing the time zone that is printed in the output. If the investigator would like all the output in the same time zone, irrespective of the input time zone, that can be defined here. [defaults to the same value as time_zone]
offset: The time on any given computer can differ vastly from a correct clock. It is essential to correct for this if the offset to the real clock is known and timestamps from more than one system are being correlated, and this option provides a way to do that. The value is either an integer or a string. If it is an int, it represents the number of seconds the clock differs (it can be prepended with a - sign indicating a negative difference). It can also be a string matching the regular expression "^-?\d+[hms]?$", where 1h means exactly one hour difference (h = hour, m = minute, s = second).
exclusions: A string containing a comma-separated list of exclusions. Sometimes the tool does fail (oh yes, that has actually happened) and fixing the bug is not trivial or cannot be done in time. Or you may simply not want to include certain files in the timeline. In both cases this list can be used. Each entry is a string that is used as a regular expression for exclusion (so do not put something like 'a' in there, since that will exclude all files that have the character a somewhere in the path). [empty]
text: A string that will be prepended to every path in the output. If the tool is run with the text variable set to 'C:', that text will be prepended to every path printed out by the tool. [empty]
A string that contains the path to a temporary directory. The tool sometimes needs to write files to a temporary directory; this occurs for instance when dealing with locked SQLite databases, and possibly in other scenarios. Therefore the tool needs ready access to a temporary directory where it can write data. Different operating systems have their own default directories, such as /tmp on *NIX. The tool does attempt to detect this directory, but for various reasons it may be desirable to override its location. ['']
debug: An integer indicating the debug level of the tool. There are currently three levels observed:
0 = no debug
1 = debug information turned on.
2 = excessive debug information turned on.
digest: A boolean (0/1) that indicates whether or not an MD5 hash should be calculated for every file and included as an attribute. N.b. this increases the time it takes the tool to complete by a considerable amount. [FALSE]
quick: Boolean value (0/1). One of the bottlenecks of this tool is the verification of each and every file passed to it, which makes it extremely important that the verification process is both quick and accurate. However, the tests that are made might sometimes be too slow or more accurate than needed, and this option makes it possible to run a less accurate yet quicker test. Some input modules (although not nearly all of them) may support this option, which skips the more detailed tests and accepts a more rudimentary validation that a file is what it says it is. [FALSE]
The file that the tool writes its output to. [STDOUT]
raw: A boolean (0/1) that flags whether or not the tool bypasses the output mechanism that the output modules provide. If this is set to false the tool will operate as usual and format each timestamp object using an output module, but if true the tool will return the RAW timestamp object instead of a formatted one. [FALSE]
A boolean value (0/1) that indicates whether to append to the output file or to overwrite it. [FALSE]
Boolean (0/1). This is a bit of a misnomer. Some input modules tend to give excessive details in their message/description and even provide additional timestamps that may or may not be pertinent in every case. This option was added so that these perhaps too verbose messages/details are not introduced into the output unless wanted/needed. This means that $FN timestamps are skipped in the $MFT module, loaded drivers are not printed in the prefetch one, etc. [FALSE]
hostname: A string that contains the hostname of the image/host the files are extracted from. Some input modules have the capability to extract the hostname, as do some pre-processors. This variable can however be set to override that and to make sure the hostname is printed on every event. [unknown]
preprocess: A boolean (0/1) that defines whether pre-processors should be run before the start of the run. [FALSE]
When all values have been assigned the routine will go over each assigned variable and call a verification routine on them to verify that the variable is valid and that the supplied value of it is also valid.
When all this is done the routine will assign some other values that are used by the module, such as the OS of the computer using the tool, etc. It will also assign the value of 1 (TRUE) to the variable is_valid, indicating that we have properly set up the module and that this instance is a valid instance of Log2Timeline.
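The default-then-override behavior described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the helper name build_settings is hypothetical, and the defaults table only repeats a few of the values listed above.

```perl
use strict;
use warnings;

# A hypothetical subset of the defaults described above.
my %defaults = (
    file      => '.',
    recursive => 0,
    input     => 'all',
    output    => 'csv',
    time_zone => 'local',
    hostname  => 'unknown',
);

# build_settings is an illustrative helper, not part of the Log2Timeline API.
# Recognized keys override the defaults, unrecognized keys are set aside.
sub build_settings {
    my (%args) = @_;
    my %settings = %defaults;
    my @ignored;
    foreach my $key (sort keys %args) {
        if (exists $defaults{$key}) {
            $settings{$key} = $args{$key};
        } else {
            push @ignored, $key;
        }
    }
    return (\%settings, \@ignored);
}

my ($settings, $ignored) = build_settings(recursive => 1, bogus => 'x');
print "$_ = $settings->{$_}\n" foreach sort keys %$settings;
print "ignored: @$ignored\n";
```

Only the values that differ from the defaults need to be passed in; everything else keeps its hardcoded value.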
is_valid
A simple subroutine that checks if the variable valid is set or not
_run_preprocess
A private method (not part of the public API).
A pre-processor is something that is run prior to the real execution of the tool in order to collect information from the image. Each pre-processor can then either simply output the result of its findings or save it in a class variable that other input/output modules can use to give more context around events.
This routine starts by finding all the modules that are available in the pre-processing directory. It then runs each one of them to gather the necessary information and update the settings of the tool.
_build_exclusions
A private method (not part of the public API).
A simple routine that examines the exclusion list passed to the tool and converts it into a hash. The exclusion list is a string, separated with commas (,), containing file names or parts of file names that should be excluded from the recursive scanner.
The routine simply reads the class variable 'exclusions' and builds a hash called 'exclude_list' that contains all the patterns found in the exclusion list.
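As a rough sketch of that conversion, and of how such a hash could later be used to skip paths, consider the following. The helper names build_exclusions and is_excluded are hypothetical and the sample patterns are made up; only the 'exclude_list' name comes from the description above.

```perl
use strict;
use warnings;

# Illustrative helper (hypothetical name): turn 'a,b,c' into a hash of patterns.
sub build_exclusions {
    my ($exclusions) = @_;
    my %exclude_list;
    return \%exclude_list unless defined $exclusions and $exclusions ne '';
    foreach my $pattern (split /,/, $exclusions) {
        next if $pattern eq '';
        $exclude_list{$pattern}++;
    }
    return \%exclude_list;
}

# A path is excluded as soon as any pattern matches anywhere in it.
sub is_excluded {
    my ($path, $exclude_list) = @_;
    foreach my $pattern (keys %$exclude_list) {
        return 1 if $path =~ /$pattern/;
    }
    return 0;
}

my $list = build_exclusions('pagefile\.sys,/proc/');
print is_excluded('/mnt/img/pagefile.sys', $list)        ? "skip\n" : "parse\n";
print is_excluded('/mnt/img/Windows/notepad.exe', $list) ? "skip\n" : "parse\n";
```

Because each entry is used as a regular expression, a short pattern matches far more paths than one might expect.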
version
A very simple routine that only returns the current version of the tool.
_set_timezone
A private method (not part of the public API).
This routine is used to check if a string representing a timezone is a valid timezone that is accepted by the DateTime library.
Since there are potentially two time zones defined by the end user, both the one of the suspect system/files and of the desired output there is a switch indicating which one we are testing. There is no difference between the two tests, this switch was simply introduced to make debugging information more concise, that is the difference is simply in the text used in the debug dialog.
The test that is performed is simply to load the DateTime library with the supplied timezone. If it successfully loads up it is considered to be a valid timezone string.
If the timezone that was selected is 'local' then the extracted name of said timezone is pulled from the DateTime object created (all the 'local' magic occurs within the DateTime library).
It is possible to define a long name for the timezone (e.g. 'Australia/Sydney'), so the DateTime library is checked to see if there is a short name for that particular timezone. If so, that name is also returned (and used in the output instead of the longer one).
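A minimal sketch of this kind of check is shown below, assuming the DateTime and DateTime::TimeZone modules from CPAN are installed. The helper name check_timezone is hypothetical, and the engine's own routine may well be structured differently.

```perl
use strict;
use warnings;

# check_timezone is a hypothetical helper; it returns the short name of the
# time zone, or undef if the string is rejected (or DateTime is unavailable).
sub check_timezone {
    my ($name) = @_;
    # DateTime is not a core module, so load it lazily and fail gracefully.
    return unless eval { require DateTime; require DateTime::TimeZone; 1 };
    my $tz = eval { DateTime::TimeZone->new(name => $name) };
    return unless defined $tz;    # the library rejected the string
    return DateTime->now(time_zone => $tz)->time_zone_short_name;
}

foreach my $candidate ('local', 'Australia/Sydney', 'Not/AZone') {
    my $short = check_timezone($candidate);
    print "$candidate: ", (defined $short ? "valid ($short)" : 'rejected'), "\n";
}
```

The validity test is simply whether the library can construct a time zone object from the string; the short name is read from a DateTime object created in that zone.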
_verify
A private method (not part of the public API).
Since we are accepting values from the user of the tool, or from a front-end that cannot be trusted, we need to validate that each attribute is correctly formed. This is not just an attempt to verify user-supplied data for security purposes; it is also here to prevent the tool from crashing in later stages due to a malformed parameter.
For each attribute/parameter of the tool that can be defined through the API, its value has to be validated. An attribute is not assigned a value unless this validation returns a true value.
The validation can be very simple, or comprehensive, depending on several factors (one being not completing the implementation).
The routine has a list of all accepted attributes, and if one is passed to the tool that the validation routine does not recognize it is deemed as an invalid attribute and therefore not saved/assigned.
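One way to picture such a validation step is a table of per-attribute checks, as in the sketch below. The table, the helper name verify_attribute and the specific checks are assumptions made for illustration (the offset pattern and the debug levels are taken from the parameter descriptions earlier in this document).

```perl
use strict;
use warnings;

# Hypothetical validation table: attribute name => check on the supplied value.
my %checks = (
    recursive => sub { $_[0] =~ /^[01]$/ },
    debug     => sub { $_[0] =~ /^[012]$/ },
    offset    => sub { $_[0] =~ /^-?\d+[hms]?$/ },
    output    => sub { $_[0] =~ /^\w+$/ },
);

# verify_attribute is illustrative only; unknown attributes are always rejected.
sub verify_attribute {
    my ($name, $value) = @_;
    return 0 unless defined $value and exists $checks{$name};
    return $checks{$name}->($value) ? 1 : 0;
}

my %settings;
my %request = (recursive => 1, debug => 7, offset => '-2h', colour => 'blue');
foreach my $name (sort keys %request) {
    if (verify_attribute($name, $request{$name})) {
        $settings{$name} = $request{$name};    # only validated values are stored
    } else {
        print STDERR "rejected: $name\n";
    }
}
print join(', ', map { "$_=$settings{$_}" } sort keys %settings), "\n";
```

An unrecognized attribute and a recognized attribute with a malformed value are treated the same way: neither is assigned.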
get_timezone_list
This is a simple sub routine that pulls out all names of supported timezones of the DateTime library and puts them in a list.
It then sorts that list alphabetically and surrounds it with a banner that gets returned for output.
start
This is one of the main sub routines, the glue that holds everything together. Once all values have been assigned to the module and processing can begin, this is the routine that starts it all.
The routine starts by checking if it should do a recursive search or simply look at a single file.
It will then invoke various internal/protected sub routines that verify and load up the needed functionality. Examples of the magic that occurs in this routine are: initiating pre-processing, loading input and output modules, figuring out what the temporary directory is, calculating the clock offset, assigning timezones and building exclusions.
When all that preparation is done the routine will either call a function to parse the file or initiate the recursive scan of a mount point/directory.
get_out_footer
This subroutine can be called to retrieve the footer of an output file.
This is designed for a front-end to be able to append to an output file even though the output file has a footer.
The problem this routine tries to solve is that if a file has already been created to store the timestamps and that particular format contains a footer, simply appending to it will not cut it. That would break the format.
The purpose of this routine is to invoke the desired output module and retrieve the footer that it will output and return that to the front-end that then can remove the footer from the previous file before starting to output new data.
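From the front-end's side, the footer handling could look something like the sketch below. The helper strip_footer and the XML-like footer string are invented for illustration; in practice the footer would be whatever get_out_footer returns for the chosen output module.

```perl
use strict;
use warnings;

# strip_footer is a hypothetical front-end helper. $footer stands in for the
# string an output module would hand back through get_out_footer.
sub strip_footer {
    my ($content, $footer) = @_;
    return $content if !defined $footer or $footer eq '';
    my $pos = length($content) - length($footer);
    return $content if $pos < 0;
    # only cut when the existing output really ends with the footer
    return substr($content, 0, $pos) if substr($content, $pos) eq $footer;
    return $content;
}

my $footer   = "</events>\n";    # made-up footer
my $existing = "<events>\n<event id=\"1\"/>\n" . $footer;
my $appended = strip_footer($existing, $footer) . "<event id=\"2\"/>\n" . $footer;
print $appended;
```

The old footer is removed first, the new data is appended, and the footer is written once more at the very end so the format stays intact.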
_parse_dir
A private method (not part of the public API).
This is the recursive method/routine/scanner of the engine. When the tool encounters a directory and is in recursive mode, it uses this method to go through every file in the supplied directory, and if it stumbles upon a directory it calls itself with that directory as the root (and thus a recursive method is born).
It is here that the exclusion list is honored. For each file/directory that is found within the supplied directory the path is compared to the entries found inside the exclusion list. If a match is found, that particular file is not tested.
The logic in this method is simple:
List up all files within the supplied directory.
Done.
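A stripped-down version of such a scanner might look like the sketch below. It is not the engine's code: scan_dir is a hypothetical name, the exclusions are passed as a plain array of patterns, and files are merely collected rather than parsed. The demo builds a throwaway directory tree so that it can be run anywhere.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# scan_dir is a hypothetical stand-in for the recursive scanner. $exclude is an
# array reference of regular expression strings, $found collects file paths.
sub scan_dir {
    my ($dir, $exclude, $found) = @_;
    opendir(my $dh, $dir) or return;
    my @entries = sort grep { $_ ne '.' and $_ ne '..' } readdir($dh);
    closedir($dh);

    foreach my $entry (@entries) {
        my $path = File::Spec->catfile($dir, $entry);
        # honor the exclusion list before doing anything else with the path
        next if grep { $path =~ /$_/ } @$exclude;
        if (-d $path and not -l $path) {
            scan_dir($path, $exclude, $found);    # recurse into the directory
        } elsif (-f $path) {
            push @$found, $path;                  # the engine would parse it here
        }
    }
    return;
}

# build a small throwaway tree to scan
my $root = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($root, 'logs') or die "mkdir failed: $!";
foreach my $rel ('a.txt', 'pagefile.sys', File::Spec->catfile('logs', 'b.log')) {
    open(my $fh, '>', File::Spec->catfile($root, $rel)) or die "open failed: $!";
    close($fh);
}

my @found;
scan_dir($root, ['pagefile\.sys$'], \@found);
print "$_\n" foreach @found;
```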
_parse_file
A private method (not part of the public API).
This sub routine reads the class variable 'file' that contains the path name to a file that needs to be parsed. It accepts only a single string to either a file or a directory.
It will then test whether the path is a file or a directory, and open it accordingly.
Then the routine will go over each input module that has been loaded up in the tool and attempt to parse the file/directory using that module. It will provide the input module with the necessary information it needs, such as:
fh: A filehandle to the file that needs to be parsed.
name: Name of the file (the full path).
It will then attempt to verify that the module can parse the file. If the file validates successfully, it will check whether the module returns a single timestamp object or one object per line read (variable 'multi-line').
The routine will then either collect that single container and go over each entry therein, or call get_time repeatedly until all timestamp objects have been collected.
For each output object gathered the tool will check if it should return the raw object (as defined in the 'raw' class variable) or process the output with an output module.
When the routine has completed parsing the file it will close it.
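The flow described above can be sketched with a pair of mock input modules. Everything in this example is made up for illustration, including the module structure, the callback names other than get_time, and the driver parse_with_modules; real input modules are Perl packages with a richer interface.

```perl
use strict;
use warnings;

# Mock input modules (entirely made up): each offers verify and get_time
# callbacks plus a multi_line flag, loosely mirroring the description above.
my @lines   = ('10 login', '20 logout');
my @modules = (
    {
        name       => 'never_matches',
        multi_line => 0,
        verify     => sub { 0 },
        get_time   => sub { return {} },
    },
    {
        name       => 'line_based',
        multi_line => 1,
        verify     => sub { 1 },
        # one timestamp object per call, undef when the input is exhausted
        get_time   => sub {
            my $line = shift @lines;
            return unless defined $line;
            my ($time, $desc) = split / /, $line, 2;
            return { time => $time, desc => $desc };
        },
    },
);

# parse_with_modules is a hypothetical driver, not the engine's real code.
sub parse_with_modules {
    my ($file, $mods) = @_;
    my @events;
    foreach my $mod (@$mods) {
        next unless $mod->{verify}->($file);
        if ($mod->{multi_line}) {
            while (defined(my $t = $mod->{get_time}->())) { push @events, $t; }
        } else {
            # a single container holding every entry
            my $container = $mod->{get_time}->();
            push @events, map { $container->{$_} } sort keys %$container;
        }
        last;    # stop at the first module that accepts the file
    }
    return @events;
}

my @events = parse_with_modules('auth.log', \@modules);
print "$_->{time}: $_->{desc}\n" foreach @events;
```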
_process_timestamp
A private method (not part of the public API).
After each timestamp object has been extracted from the parsing engine it is passed through this sub routine.
What it essentially does is process the timestamp object. It adds some values to it and fixes/adjusts others.
For instance, this routine replaces all backslashes with forward slashes in the description field, it adds the text passed as an argument to the timestamp object, it includes information about the directory the tool was called on, includes the hostname, etc.
The sub routine also injects other values into the timestamp object, such as the inode value of the file. If digest calculation is enabled (the 'digest' parameter), it also calculates an MD5 sum for each file passed to the tool and adds that to the timestamp object.
Finally this routine is also responsible for adjusting the timestamps according to the value of the time offset passed as a parameter. That is if we need to adjust all the timestamps because of a faulty clock on the system, that time difference is added or subtracted from the timestamp in this routine.
    sub _process_timestamp($$)
    {
        my $self   = shift;
        my $t_line = shift;

        return 0 unless defined $t_line->{'desc'};
        return 0 if $t_line->{'desc'} eq '';

        # fix the \ vs. / problem in the output
        $t_line->{'desc'}  =~ s/\\/\//g;
        $t_line->{'short'} =~ s/\\/\//g;

        if (defined $self->{'text'} and $self->{'text'} ne '') {
            $t_line->{'extra'}->{'path'} = $self->{'text'};
        }

        # add information about the directory passed on to the tool
        $t_line->{'extra'}->{'parse_dir'} = $self->{'file_orig'} if $self->{'recursive'};

        # default value of self->hostname is unknown
        if ($self->{'hostname'} ne 'unknown') {
            # we have a user supplied hostname, use it unless the input module already set one
            $t_line->{'extra'}->{'host'} = $self->{'hostname'} unless defined $t_line->{'extra'}->{'host'};
        } else {
            # use the default one of 'unknown' unless it is already assigned in the input module
            $t_line->{'extra'}->{'host'} = 'unknown' unless defined $t_line->{'extra'}->{'host'};
        }

        # add the filename and the name of the current input module to the t_line
        $t_line->{'extra'}->{'filename'} = $self->{'file'}   unless defined $t_line->{'extra'}->{'filename'};
        $t_line->{'extra'}->{'format'}   = $self->{'cur_in'} unless defined $t_line->{'extra'}->{'format'};

        # fill in the inode value if the input module did not set it
        $t_line->{'extra'}->{'inode'} = (stat($self->{'file'}))[1] unless defined $t_line->{'extra'}->{'inode'};

        # fix the time settings (using time_offset)
        foreach (keys %{ $t_line->{'time'} }) {
            next unless defined $t_line->{'time'}->{$_}->{'value'};
            $t_line->{'time'}->{$_}->{'value'} += $self->{'offset'};
        }

        # check to see if we are to calculate MD5 sum of the file
        if ($self->{'digest'}) {
            print STDERR 'File: ' . $self->{'file'} . ' and digest: ' . $self->{'digest'} . "\n" if $self->{'debug'};

            # check if we've already calculated the md5 for this file
            if (defined($self->{'digest_list'}->{ $self->{'file'} })) {
                $t_line->{'extra'}->{'md5'} = $self->{'digest_list'}->{ $self->{'file'} };
            }
            elsif (open(my $tf, '<', $self->{'file'})) {
                # calculate the MD5 sum
                binmode($tf);
                my $sum = Digest::MD5->new;
                $sum->addfile($tf);

                # hexdigest resets the object, so read it once and reuse the value
                my $hex = $sum->hexdigest;
                $self->{'digest_list'}->{ $self->{'file'} } = $hex;
                $t_line->{'extra'}->{'md5'} = $hex;
                close($tf);
            }
        }

        return 1;
    }
_open_file
A private method (not part of the public API).
A simple sub routine that is responsible for opening up a file and assigning the filehandle to the $self->{'fh'} variable that is used in the tool.
_close_file
A private method (not part of the public API).
A simple sub routine that has only one task, and that is to close the open filehandle.
A private method (not part of the public API).
A simple sub routine that opens up a directory and gives back an open handle to that directory.
No arguments passed to it nor returned (only uses $self)
A private method (not part of the public API).
A simple sub routine that closes a filehandle to a directory.
No arguments passed to it nor returned (only uses $self)
_input_exists
A private method (not part of the public API).
Determine if an input module exists or not. This sub routine takes as an input a list of input modules. This list can consist of a single input module, a single reference to a list file, or it may be a more complex list containing both modules, lists and negative modules (list of modules that should be excluded).
    sub _input_exists()
    {
        my $self = shift;
        my $in   = shift;
        my $ret  = 0;    # the default return value (0 means nothing is missing)
        my ($is_list, $is_module);

        # we can be guessing.. so check out if that's the case
        return 0 if $in eq 'all';

        # the list might contain a minus sign, let's remove them all
        $in =~ s/-//g;

        # we might be using several modules
        my @s = split(/,/, $in);

        # go over each one (only done once if just one is passed on)
        foreach (@s) {
            # no further checking is needed once a missing entry has been found
            next if $ret;

            # set the default values
            $is_list   = 0;
            $is_module = 0;

            # check if we are about to use a list file
            $is_list = 1 if -f $self->{'lib_dir'} . $self->{'sep'} . 'Log2t' . $self->{'sep'} . 'input' . $self->{'sep'} . $_ . '.lst';

            # or we are using a single input module
            $is_module = 1 if -f $self->{'lib_dir'} . $self->{'sep'} . 'Log2t' . $self->{'sep'} . 'input' . $self->{'sep'} . $_ . '.pm';

            # either one needs to be true, otherwise return the name of the missing entry
            $ret = $_ unless ($is_list or $is_module);
        }

        return $ret;
    }
_output_exists
A private method (not part of the public API).
Check for the existence of the output module. That is, this routine checks whether or not the output module chosen by the user exists.
    sub _output_exists()
    {
        my $self = shift;
        my $out  = shift;

        # the full path to the output module that is being tested
        my $path = $self->{'lib_dir'} . $self->{'sep'} . 'Log2t' . $self->{'sep'} . 'output' . $self->{'sep'} . $out . '.pm';

        print STDERR "[LOG2T] Testing the existence of " . $path . "\n" if $self->{'debug'};

        return 1 if -f $path;

        # the module was not found, print out what was tested
        print STDERR "SEPARATOR [" . $self->{'sep'} . "]\n";
        print STDERR $path . "\n";

        return 0;
    }
get_timezone
A small subroutine that simply returns the value of the current timezone used by the tool.
_load_input_list
A private method (not part of the public API).
A routine that opens up a list file, reads in all its content and stores it in the input_list class attribute.
If a list file is passed to the tool as a parameter, this routine will open the file and read it line by line to get a list of all the input modules it should load (each input list contains the names of the modules it wants to use).
    sub _load_input_list()
    {
        my $self = shift;
        my $list = shift;

        # the variable can either be a user supplied list (comma separated) or a file
        # called INPUT.lst which lists the input modules that are to be used
        my $path = $self->{'lib_dir'} . $self->{'sep'} . 'Log2t' . $self->{'sep'} . 'input' . $self->{'sep'} . $list . '.lst';

        # we are reading from a file
        return 0 unless -f $path;

        open(my $lst, '<', $path) or return 0;
        while (<$lst>) {
            chomp;
            $self->{'input_list'}->{$_}++;
        }
        close($lst);

        return 1;
    }
_load_input_module
A private method (not part of the public API).
Either removes a module from, or adds a module to, the input list.
    sub _load_input_module()
    {
        my $self = shift;
        my $mod  = shift;

        # the full path to the directory that stores the input modules
        my $dir = $self->{'lib_dir'} . $self->{'sep'} . 'Log2t' . $self->{'sep'} . 'input' . $self->{'sep'};

        # check for the first letter (if it is - then we use all except the ones listed)
        if (substr($mod, 0, 1) eq '-') {
            # remove the -
            $mod = substr($mod, 1);

            # check if the module exists (and remove it from the list if it exists)
            if (-f $dir . $mod . '.pm') {
                print STDERR "[DEBUG] Removing the module $mod.\n"
                  if ($self->{'debug'} and defined($self->{'input_list'}->{$mod}));

                delete($self->{'input_list'}->{$mod}) if exists($self->{'input_list'}->{$mod});
            } else {
                print STDERR "[DEBUG] Module ($mod) does not exist.\n";
            }
        } else {
            # add the module to the list
            if (-f $dir . $mod . '.pm') {
                print STDERR "[DEBUG] Adding the module $mod.\n" if $self->{'debug'};
                $self->{'input_list'}->{$mod}++;
            } else {
                print STDERR "[DEBUG] Module ($mod) does not exist.\n" if $self->{'debug'};
            }
        }
    }
_load_input
A private method (not part of the public API).
_load_output
A private method (not part of the public API).
_get_module_help
A private method (not part of the public API).
_format_sort
A private method (not part of the public API).
A sorting 'algorithm' for input modules.
The problem with some of the input modules is that there might be two input modules that are capable of parsing the same file. And since the tool stops processing each file when a match is found you might end up parsing a file using a module that is not really suited to do so.
The most prevalent example is the exif module, which is capable of extracting generic metadata from a vast number of different file types. There might be other modules that are specifically written to parse a particular file type, and which do a much better job of extracting relevant data from it. This routine is therefore written to lower the priority of the more generic modules so that they do not parse files before the more specific ones do.
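The idea of pushing generic modules to the back can be illustrated with a simple priority-aware sort. This is only a sketch: the helper sort_modules, the priority table and its value are invented, and the engine's real sort may use different criteria.

```perl
use strict;
use warnings;

# Assumed for illustration: 'exif' is the generic module named in the text,
# and the priority value is invented.
my %low_priority = (exif => 1);

# sort_modules is a hypothetical helper: specific modules first, generic ones
# last, alphabetical within each group.
sub sort_modules {
    my @names = @_;
    return sort {
        ($low_priority{$a} || 0) <=> ($low_priority{$b} || 0)
            or $a cmp $b
    } @names;
}

my @order = sort_modules(qw(exif prefetch mft recycler));
print "@order\n";
```

Since the tool stops at the first module that accepts a file, the order of this list decides which module gets the first chance to parse it.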
Currently the following modules do have lower priority associated to them:
_calc_offset
A private method (not part of the public API).
A sub routine that takes the offset value given to the API and converts it into an integer number of seconds that is used to adjust the timestamps read.
The offset can take one of the following forms:
+ int: number of seconds (e.g. 52 or -12)
+ string: An int with an appended character indicating the unit of the int. Accepted characters are h, m or s, corresponding to hours, minutes and seconds. Examples: 52s or 1h. (N.b. it is not possible to combine units, as in 4h2m1s, for finer granularity; only a single unit is accepted, which makes the int form the most useful since offsets rarely come in whole hours.)
No arguments are needed since the routine only uses and sets class variables.
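The conversion itself can be sketched as a small stand-alone function. Note that this sketch takes the offset as an argument and returns the result, whereas the routine described above works on class variables; the name offset_to_seconds is hypothetical.

```perl
use strict;
use warnings;

# offset_to_seconds is a hypothetical helper; it returns undef for bad input.
sub offset_to_seconds {
    my ($offset) = @_;
    return unless defined $offset;
    # an optional minus sign, digits, and at most one unit character
    return unless $offset =~ /^(-?\d+)([hms]?)$/;
    my ($number, $unit) = ($1, $2);
    my %factor = (h => 3600, m => 60, s => 1, '' => 1);
    return $number * $factor{$unit};
}

foreach my $value ('52', '-12', '52s', '1h', '-30m', '4h2m1s') {
    my $seconds = offset_to_seconds($value);
    printf "%-7s => %s\n", $value, defined $seconds ? $seconds : 'invalid';
}
```

A combined value such as 4h2m1s is rejected, so an offset of that kind has to be given as a plain number of seconds (14521 in this case).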
Kristinn Gudjonsson <kristinn (a t) log2timeline ( d o t ) net> is the original author of the program.
The tool is released under GPL so anyone can contribute to the tool and examine the source code. Copyright 2009-2012.
Documentation for each input module follows the name of Log2t::input::MODULE and for output modules Log2t::output::MODULE
log2timeline, the Log2t::Time manpage, the Log2t::BinRead manpage, the Log2t::Common manpage, the Log2t::Network manpage, the Log2t::Numbers manpage, the Log2t::Win manpage, the Log2t::WinReg manpage