Splitter
Diff splitting implementation for CodeMap.
logger
module-attribute
logger = getLogger(__name__)
MAX_DIFF_CONTENT_LENGTH
module-attribute
MAX_DIFF_CONTENT_LENGTH = 100000
MAX_DIFF_LINES
module-attribute
MAX_DIFF_LINES = 1000
SMALL_SECTION_SIZE
module-attribute
SMALL_SECTION_SIZE = 50
COMPLEX_SECTION_SIZE
module-attribute
COMPLEX_SECTION_SIZE = 100
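This page does not say how the two `MAX_DIFF_*` constants are applied. As a minimal sketch, assuming they act as upper bounds on a diff's character count and line count, a size guard could look like the following. The helper name `exceeds_limits` is hypothetical and not part of CodeMap.

```python
MAX_DIFF_CONTENT_LENGTH = 100000  # value documented above; unit (characters) is an assumption
MAX_DIFF_LINES = 1000  # value documented above


def exceeds_limits(diff_text: str) -> bool:
    """Hypothetical helper: report whether a diff is over either limit."""
    too_long = len(diff_text) > MAX_DIFF_CONTENT_LENGTH
    too_many_lines = len(diff_text.splitlines()) > MAX_DIFF_LINES
    return too_long or too_many_lines
```

A caller might use such a check to decide whether to truncate or skip a diff before further processing.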
DiffSplitter
Splits Git diffs into logical chunks.
Source code in src/codemap/git/diff_splitter/splitter.py
__init__
__init__(config_loader: ConfigLoader | None = None) -> None
Initialize the diff splitter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`config_loader` | `ConfigLoader \| None` | ConfigLoader object for loading configuration | `None` |
Source code in src/codemap/git/diff_splitter/splitter.py
config_loader
instance-attribute
config_loader = config_loader
similarity_threshold
instance-attribute
similarity_threshold = similarity_threshold
directory_similarity_threshold
instance-attribute
directory_similarity_threshold = (
directory_similarity_threshold
)
min_chunks_for_consolidation
instance-attribute
min_chunks_for_consolidation = min_chunks_for_consolidation
max_chunks_before_consolidation
instance-attribute
max_chunks_before_consolidation = (
max_chunks_before_consolidation
)
max_file_size_for_llm
instance-attribute
max_file_size_for_llm = max_file_size_for_llm
max_log_diff_size
instance-attribute
max_log_diff_size = max_log_diff_size
split_diff
async
Split a diff into logical chunks using semantic splitting.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`diff` | `GitDiff` | GitDiff object to split | required |
Returns:
Type | Description |
---|---|
`tuple[list[DiffChunk], list[str]]` | Tuple of (list of DiffChunk objects based on semantic analysis, list of filtered large files) |
Raises:
Type | Description |
---|---|
`ValueError` | If semantic splitting is not available or fails |
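The documented contract of `split_diff` (an `async` method that returns a two-element tuple and may raise `ValueError`) can be sketched from the caller's side as follows. To keep the example runnable without CodeMap installed, `GitDiff`, `DiffChunk`, and `StubSplitter` below are hypothetical stand-ins that mimic only the shapes listed above; their fields and the stub's behaviour are assumptions, not the real implementation.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class GitDiff:
    """Hypothetical stand-in for codemap's GitDiff (fields are assumptions)."""

    files: list[str] = field(default_factory=list)
    content: str = ""


@dataclass
class DiffChunk:
    """Hypothetical stand-in for codemap's DiffChunk (fields are assumptions)."""

    files: list[str]
    content: str


class StubSplitter:
    """Mimics only the documented signature of DiffSplitter.split_diff."""

    async def split_diff(self, diff: GitDiff) -> tuple[list[DiffChunk], list[str]]:
        if not diff.content.strip():
            # The documented method raises ValueError when semantic splitting
            # is unavailable or fails; an empty diff is used here purely to
            # exercise that path in the stub.
            raise ValueError("semantic splitting failed")
        return [DiffChunk(files=diff.files, content=diff.content)], []


async def summarize(splitter: StubSplitter, diff: GitDiff) -> str:
    """Await split_diff, unpack the returned tuple, and handle ValueError."""
    try:
        chunks, filtered_large_files = await splitter.split_diff(diff)
    except ValueError as exc:
        return f"split failed: {exc}"
    return f"{len(chunks)} chunk(s), {len(filtered_large_files)} large file(s) filtered"


if __name__ == "__main__":
    example = GitDiff(files=["a.py"], content="+print('hi')\n")
    print(asyncio.run(summarize(StubSplitter(), example)))
```

With the real class, the same calling pattern would presumably apply: construct `DiffSplitter` (optionally passing a `ConfigLoader`), `await` the call inside a coroutine, and unpack the chunks and the list of filtered large files.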
Source code in src/codemap/git/diff_splitter/splitter.py