Tuesday, March 30, 2010

Insert, delete space at any place in file without making temporary file copy

Goal:
Implement methods that insert or delete space at any place in file (affecting file length), using technique that copies content within file so there is no need for temporary file.

Implementation idea:
When inserting space of N bytes:
    1. Extend file size by N bytes
    2. Copy/shift all data after insertion position, N bytes towards file's end
    When deleting space of N bytes:
        1. Copy/shift all data after deletion position + N bytes, N bytes towards file's beginning
        2. Shrink file size by N bytes

    Why?
    Framework and system file API offer functions to write new file or append to existing, there are no functions to insert new data (and not overwriting existing) in existing file at some arbitrary position e.g. beginning of file or middle of file. Also, there are no functions for removing data in similar way.

    Usual solutions that I've seen tend to (to simulate insert):
    1. make temporary file
    2. append data before insert position from original file
    3. append data to insert
    4. append data after insert position from original file
    5. delete original file
    6. rename temporary file to original file's name
    Similar technique is used to simulate deletion of part of file.

    Problem with that approach is:
    • choosing name for temporary file
    • need for disk space for original and temporary file
    • copying of original data positioned before insert/delete position (no need for that)
    Usage example:

    Legend:
    Unchanged
    Copied
    Filled/Inserted

    Notes:
     - Each step works on resulting data from previous step.
     - Inserted part can be also interpreted as unchanged data, but logically it is treated as uninitialized as it is expected that user will write new data after the part is inserted.


    Initial file stream (letter 'A' is at position 0)
    FileStream file = new FileStream("test.txt", FileMode.Create, FileAccess.ReadWrite);
    byte[] initial = Encoding.ASCII.GetBytes("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
    file.Write(initial, 0, initial.Length);
    
    Result: ABCDEFGHIJKLMNOPQRSTUVWXYZ


    Insert (make room for) 4 bytes at position 7(letter 'H')
    Quazistax.QSFile.InsertFilePart(file, 7, 4);
    
    Result: ABCDEFGHIJKHIJKLMNOPQRSTUVWXYZ


    Fill 4 bytes at position 7 with '='
    Quazistax.QSFile.FillFilePart(file, 7, 4, (byte)'=');
    
    Result: ABCDEFG====HIJKLMNOPQRSTUVWXYZ


    Delete 7 bytes at position 16 (part 'MNOPQRS', letter 'M' is at position 16)
    Quazistax.QSFile.DeleteFilePart(file, 16, 7);
    
    Result: ABCDEFG====HIJKLTUVWXYZ


    Copy 8 bytes from position 5 (letter 'F') to position 14 (letter 'K')
    Quazistax.QSFile.CopyFilePart(file, 5, 14, 8);
    
    Result: ABCDEFG====HIJFG====HIZ

    Test:
     
    Source code:
    You can download the source here.

    No comments:

    Post a Comment