applications/file

bulk_extractor - bulk_extractor is a C++ program that scans a disk image, a file, or a directory of files and extracts useful information

Website: http://afflib.org/software/bulk_extractor
License: GPL
Description:
bulk_extractor is a C++ program that scans a disk image, a file, or a
directory of files and extracts useful information without parsing the
file system or file system structures. Useful information currently
includes email addresses, URLs, credit card numbers, EXIF data
structures, KML files, AES encryption keys (from RAM), IP packets, and
other kinds of forensically important information. Results are stored in
text files (called feature files) that can be easily inspected, parsed,
or processed with automated tools. The program is multi-threaded and
will use all available cores. bulk_extractor also creates histograms of
the features that it finds, as features that are more common tend to be
more important.
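
Because feature files are plain text, a short script is enough to
post-process them. The sketch below is illustrative only: it assumes each
record is one line with tab-separated offset, feature, and context fields,
and that lines beginning with # are comments. The function names and the
sample data are hypothetical and are not part of bulk_extractor. It counts
how often each feature occurs, similar in spirit to the histograms
described above.

```python
from collections import Counter


def parse_feature_line(line):
    """Split one feature-file record into (offset, feature, context).

    Assumption: fields are tab-separated. Blank lines and lines that
    start with '#' are treated as comments and yield None.
    """
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    parts = line.split("\t", 2)
    # Pad so a record without a context still yields three fields.
    while len(parts) < 3:
        parts.append("")
    return parts[0], parts[1], parts[2]


def feature_histogram(lines):
    """Count occurrences of each feature, most common first."""
    counts = Counter()
    for line in lines:
        record = parse_feature_line(line)
        if record is not None:
            counts[record[1]] += 1
    return counts.most_common()


# Hypothetical sample records, not real bulk_extractor output.
sample = [
    "# hypothetical banner line",
    "1024\talice@example.com\tmail alice@example.com now",
    "2048\tbob@example.com\t",
    "4096\talice@example.com\tcc: alice@example.com",
]

for feature, count in feature_histogram(sample):
    print(count, feature)
```

The offset is kept as a string rather than converted to an integer, so the
sketch makes no assumption about how offsets are written.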

Packages

bulk_extractor-1.3.1-2.el5.i686 [2.2 MiB] Changelog by Lawrence Rogers (2012-11-27):
* Release 1.3.1-2
	Included necessary dependencies to build and install BEViewer
bulk_extractor-1.1.3-1.el5.i386 [62 KiB] Changelog by Morgan Weetman (2011-12-14):
* Release 1.1.3-1
	* src/xml.cpp: now works with older and newer versions of exiv2
	* src/histogram.cpp (HistogramMaker::add): looks for \000 in utf16 strings converted to utf8 and erases them (We were getting them in histograms)
	* src/scan_wordlist.cpp (wordlist_split_and_dedup): no longer adds zero-length words to wordlist
	* src/feature_recorder.cpp (feature_recorder::make_histogram): histograms no longer banner stamp or version stamp if there is no corresponding feature.
	* src/scan_net.cpp (pcap_writepkt): changed file extension from .dmp to .pcap for packets
	* src/bulk_extractor.cpp (phase1): added -A offset to add an offset.
	* src/bulk_extractor.cpp (phase1): added -Y  start-end notation in addition to -Y start notation.
	* src/feature_recorder.cpp (feature_recorder::write): added support for opt_offset_add to allow output to be shifted (for parallelizing across multiple systems.)
	* src/sbuf.h (class pos0_t): removed snprintf; now uses stringstream.
	  (operator +): changed most functions to take const & rather than a new object.
	* src/feature_recorder.cpp (feature_recorder::write): now always writes out the second \t for the context, even if there is no context.
	* configure.ac: added AC_PROG_CC AC_PROG_CXX and AC_PROG_INSTALL
	* src/Makefile.am (.flex.o): FlexLexer.h moved to MyFlexLexer.h to support CentOS where an out-of-date flex is installed.
	* src/bulk_extractor.cpp (process_path): fixed handling of /h and /r with -p option
	* configure.ac: removed pcap.h tests because they are not needed
	* src/scan_email.flex (Host): now only writes domains>0.
	* src/scan_zip.cpp (scan_zip): zip components with no name are now given <NONAME>
	* src/scan_winprefetch.cpp (scan_winprefetch): modified to only write out prefetch files with non-zero exec name
	* src/scan_net.cpp (scan_net): significant update --- I don't need libpcap to do packet carving!
	* src/image_process.cpp (sbuf_alloc): added a new iterator method it->pos0() returns the pos0 of the sbuf to be allocated by it->sbuf_alloc()
	  (sbuf_alloc): changed calloc to malloc for performance
	  (process_aff::sbuf_alloc): now throws bad_alloc if an exception is encountered
	  (process_ewf::sbuf_alloc): now throws bad_alloc
	  (process_raw::sbuf_alloc): now throws bad_alloc
	* src/bulk_extractor.cpp: removed scanner_enabled().
	* src/Makefile.am (bulk_extractor_SOURCES): removed checkpoint.h
	* src/bulk_extractor.cpp (main): checkpoint removed; restarting now done through dfxml file.
	  (phase1): do_phase1 renamed phase1; just_phase1 renamed do_phase1. phase1 and phase2 flags removed. Now automatic.
	  (main): -2 option removed
	* src/image_process_fts.cpp (process_dir::process_dir): added E01 detection.
	* src/scan_email.flex (Host): fixed crashing bug on context extraction in MAKESTRING6.
	* configure.ac: fixed conforming/non-conforming test for strchr
	* src/bulk_extractor.cpp: added HTTP_EOL which is \r\n in Unix and Mac and 
	* src/histogram.cpp (HistogramMaker::looks_like_utf16): now recognizes both little-endian and big-endian UTF-16 strings and properly converts them.
	* regress.py (analyze): now enables all scanners including wordlist
	* python/bulk_extractor.py (BulkReport.open): openfile renamed open
	* src/bulk_extractor.cpp (process_find_file): now ignores lines that begin with #
	* src/scan_winprefetch.cpp (P): changed utf16_string to wstring (which is the standard).
	* src/scan_accts.flex: replaced unicode16_to_string with utf16to8
	* src/checkpoint.h (load): named and val no longer shadow values
	* src/histogram.h (>): big surprise: it turns out that you should not subclass STL containers! Who knew? Well, a lot of people, apparently:
	  http://stackoverflow.com/questions/4353203/thou-shalt-not-inherit-from-stdvector
	  http://stackoverflow.com/questions/245475/how-do-i-create-a-generic-stdvector-destructor
	  http://stackoverflow.com/questions/3601431/base-class-class-stdvector-has-a-non-virtual-destructor
	  http://stackoverflow.com/questions/1647298/why-dont-stl-containers-have-virtual-destructors
	* src/threadpool.cpp (threadpool): modified so that master and worker are now references, rather than pointers.
	* configure.ac (HAVE_PTHREAD): added warnings for C++
	* src/base64_forensic.cpp: cleaned up prototypes.
	* src/scan_aes.cpp (valid_aes256_schedule): updated off-by-one problem.
	  (valid_aes192_schedule): updated off-by-one problem.
	  (valid_aes128_schedule): updated off-by-one problem.
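
The 1.1.3-1 changelog above notes that -Y accepts both a start notation and
a start-end notation. As a rough illustration of that idea, the sketch
below parses either form. It is not taken from the bulk_extractor source:
the function name is hypothetical, and it assumes the values are plain
decimal integers, which may not match what the real option accepts.

```python
def parse_range(spec):
    """Parse 'start' or 'start-end' into a (start, end) tuple.

    Assumption: both values are plain decimal integers. When only a
    start is given, end is returned as None.
    """
    start_text, sep, end_text = spec.partition("-")
    start = int(start_text)
    end = int(end_text) if sep else None
    if end is not None and end < start:
        raise ValueError("end must not precede start: " + spec)
    return start, end


print(parse_range("4096"))
print(parse_range("4096-8192"))
```

Returning None for a missing end lets the caller decide what an open-ended
range means, for example processing through the end of the input.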
