- MISC: Increased Diaphora version to 3.0.0.
- API: Add support for the first 'to be exposed' function, used to get the callgraph percent difference.
- CORE: Added CodeCut support to find anonymous compilation units.
- CORE: Added IDAMagicStrings to try to find compilation unit names.
- CORE: Added support to find and export Compilation Units.
- CORE: Coalesce contiguous named compilation units using the minimum and maximum address.
- CORE: Do not add matches directly to choosers; instead, work with internal Python dict objects and process them when the diffing session is done.
- CORE: More refactoring to properly support multimatches and find the best matches.
- CORE: Set a name to compilation units when enough matches indicate the name of the compilation unit using IDAMagicStrings.
- DATABASE: Added proper indices and fine-tuned SQL heuristics and queries.
- DATABASE: Moved tables and indices definitions to a different file.
- DIFF: Support for handling multiple matches by showing them in a different chooser.
- DOC: Documented all functions and members in diaphora.py.
- EXPORT: Add the `func_id` field to the `instructions` table.
- EXPORT: Also treat data references from functions to functions as code references.
- GUI: Add support for Python logging facilities.
- GUI: Added environment variable DIAPHORA_LOG_PRINT to print to stdout instead of using Python logging facilities.
- GUI: Enable slow heuristics by default only for databases of ~1,000 functions at most.
- GUI: Renamed 'Experimental' to 'Enable Speed Ups', as the old 'experimental' heuristics are either upgraded to 'normal' or removed.
- HEUR: Add a filter for a minimum of 3 instructions for heuristic 'Same address, nodes, edges and mnemonics'.
- HEUR: Add support for speed ups (internally called 'dirty heuristics') when stripped-symbol matching or patch diffing is detected.
- HEUR: Added a default ORDER BY clause to order by compilation unit when there is a named compilation unit.
- HEUR: Added a minimum ratio of 0.35 for heuristic 'Pseudo-code fuzzy AST hash'.
- HEUR: Added a minimum ratio of 0.5 for heuristic 'Pseudo-code fuzzy (normal)'.
- HEUR: Added heuristic 'Same rare basic block mnemonics list'.
- HEUR: Added heuristic 'Local affinity' to find matches in function gaps.
- HEUR: Added heuristic 'Same anonymous compilation unit function match'.
- HEUR: Added heuristic 'Same compilation unit'.
- HEUR: Added heuristic 'Same named compilation unit function match'.
- HEUR: Added heuristic type HEUR_TYPE_RATIO_MAX_TRUSTED. Results with a bad similarity ratio are assigned to the 'Partial' tab regardless of the calculated ratio.
- HEUR: Added self-explanatory new heuristic 'Same rare assembly instruction'.
- HEUR: Added support to find matches diffing assembly and pseudo-codes of previous known good matches.
- HEUR: All heuristics now select the same fields by calling `diaphora_heuristics.get_query_fields()` to retrieve the fields.
- HEUR: Allow the heuristic 'Same rare constant' to match functions with at least 3 basic blocks.
- HEUR: Always consider functions matching by name the best match, regardless of the ratio that another match might produce.
- HEUR: Changed heuristic 'Same nodes, edges and strongly connected components' to 'Same nodes, edges, loops and strongly connected components'. Now loops are also considered for matching.
- HEUR: Changed heuristic 'Similar pseudo-code and names' to only consider results with a similarity ratio higher than 0.579.
- HEUR: Consider matches only for symbol names that have at least 4 characters for heuristic 'Callee found finding matches'.
- HEUR: For heuristic 'Local affinity', consider the first match for a function the best match.
- HEUR: First proper working version (hopefully) of the support for multimatches.
- HEUR: Increased the number of decimal places used for comparison ratios to 7.
- HEUR: Increased the queries timeout to 5 minutes.
- HEUR: Marked heuristic 'Same rare constant' as slow.
- HEUR: Moved heuristic 'Brute force' to the unreliable category.
- HEUR: Moved heuristic 'Same graph' to the unreliable category.
- HEUR: Moved heuristic 'Nodes, edges, complexity and mnemonics with small differences' to the slow ones.
- HEUR: Moved the 'Experimental' heuristics to the 'Partial' category.
- HEUR: Order by address the functions for heuristic 'Local affinity', as compilers usually put functions in the same order in binaries.
- HEUR: Relax the heuristic 'Same rare constant' to allow good matches with a bad similarity ratio to appear in the 'Partial' tab.
- HEUR: Removed heuristic 'Bytes hash and names'.
- HEUR: Removed heuristic 'Strongly connected components SPP and names'.
- HEUR: Removed heuristics that were not finding anything, namely, 'All or most attributes', 'Same address, nodes, edges and primes (re-ordered instructions)', 'Strongly connected components small-primes-product' and 'Callgraph match'.
- HEUR: Removed old wrong and buggy heuristic 'Call address sequence'.
- HEUR: Removed the slow flag from heuristics 'Switch structures', 'Pseudo-code fuzzy XXX' and 'Same graph'.
- HEUR: Removed unreliable heuristic 'Bytes sum'.
- HEUR: Rewrote heuristics 'Same rare KOKA hash' and 'Same rare MD-Index' to use the WITH clause, which makes queries much more readable and maintainable.
- HEUR: Run slow heuristics at the very end of the diffing process, after the other heuristics.
- HEUR: Run the only 2 remaining 'unreliable' heuristics at the very end of the diffing process.
- HEUR: The DISTINCT and/or the ORDER BY clauses have been removed in some SQL heuristics because they were causing some queries to never finish, triggering SQLite memory errors.
- HEUR: Use difflib.unified_diff instead of ndiff because the latter is way too slow to call hundreds of thousands of times.
- HEUR: When diffing matches to find callees, ignore those matches that differ by more than 75% in the number of basic blocks.
- MISC: Added a 'Diaphora:' prefix for log messages.
- MISC: Change the text for the 'Call Address sequence' heuristic to show which initial results the matches are based on.
- MISC: Fixed some minor typos in the sources.
- MISC: Multiple little refactorizations here and there.
- MISC: Renamed heuristic 'Same cleaned up assembly' to 'Same cleaned assembly'.
- BUG: Added the n-th fix to try not to leak cursors at all ever.
- BUG: All parallel calls to add_matches_from_query_ratio_max() were wrong.
- BUG: Always use the internal dicts for handling matches, never use the choosers except for adding the results at the end.
- BUG: Commit every transaction that must be committed.
- BUG: Do not analyze the databases each time a diff is started.
- BUG: Do not consider IDA's auto-generated names for the 'Same RVA' heuristic.
- BUG: Do not crash when there is no chooser (it's None) given for a specific category.
- BUG: Do not directly call 'sqlite3.connect()', instead call a wrapper that does whatever initialization is required.
- BUG: Handling timeouts in threads was horribly wrong because there was no code to handle the timeout inside the thread...
- BUG: Hopefully final fix for issue #5.
- BUG: If a reverser selected File -> Save As from the menu, Diaphora would fail to find the .til file and crash.
- BUG: Instruction level import support was very wrong, even with typos.
- BUG: Multiple instances of functions leaking cursors were fixed.
- BUG: Regular expression pattern in `get_cmp_asm` wasn't properly escaped.
- BUG: Removing items from choosers in IDA was broken.
- BUG: Some SQL queries were not able to properly execute due to huge B-TREEs being created by SQLite when diffing huge databases.
- BUG: Some comparisons (pseudocode and graph) were being shown, wrongly, in a different order than the others.
- BUG: Some heuristics were using a wrong SQL expression to filter functions starting with the 'sub_' prefix.
- BUG: The check to determine if Diaphora should continue finding more callees diffing previous results was wrong.
- BUG: The environment variables for multiple items were not being properly handled.
- BUG: The logic for handling unmatched choosers was the wrong way around, which was pretty confusing.
- BUG: The members `get_cmp_asm` and `get_cmp_pseudo` were being called hundreds of thousands of times for no reason when diffing.
- BUG: There were still many places where Diaphora could leak cursors in diaphora_ida.py.
- BUG: When calling the function `check_ratio()`, always convert the values of the MD-Indices to float internally.
- BUG: Workaround implemented for the IDA bug 'max non-trivial tinfo_t count has been reached' that might be triggered when exporting immense databases.
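
As an illustration of the switch from ndiff to `difflib.unified_diff` noted in the HEUR entries above, here is a minimal sketch of comparing two listings with the standard library. The helper name `changed_lines` and the sample assembly lines are hypothetical and are not taken from Diaphora's sources; this only shows the general shape of such a comparison.

```python
import difflib

# Hypothetical sample listings; not taken from Diaphora.
old = ["mov eax, 1", "add eax, ebx", "ret"]
new = ["mov eax, 1", "sub eax, ebx", "ret"]


def changed_lines(a, b):
    """Return only the added/removed lines of a unified diff of a and b."""
    # n=0 drops context lines; lineterm="" avoids trailing newlines.
    diff = difflib.unified_diff(a, b, lineterm="", n=0)
    return [
        line
        for line in diff
        # Keep '+'/'-' lines but skip the '+++' and '---' file headers.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(old, new):
    print(line)
```

`unified_diff` emits only the changed hunks, whereas `ndiff` also computes intraline hints for every differing pair of lines, which is part of why it costs more when called very often.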
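
Several BUG entries above concern leaked SQLite cursors. One common way to avoid that in Python is to wrap each cursor in `contextlib.closing`, so it is closed even if the query raises. The sketch below assumes a hypothetical `functions` table and a hypothetical helper `count_functions`; it is not Diaphora's actual code or schema.

```python
import sqlite3
from contextlib import closing


def count_functions(conn):
    """Count rows in the (hypothetical) functions table without leaking the cursor."""
    # closing() calls cur.close() on exit, even when execute() raises.
    with closing(conn.cursor()) as cur:
        cur.execute("SELECT COUNT(*) FROM functions")
        return cur.fetchone()[0]


# In-memory database with an assumed, simplified schema for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE functions (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO functions (name) VALUES (?)",
    [("main",), ("sub_401000",)],
)
conn.commit()

print(count_functions(conn))
```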