A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Wednesday, March 8, 2017

Getting Windows share via python

Windows network shares with python

Backstory

On a daily basis, we receive data extracts from a mainframe. They provide a header and data file for whatever the business users want to explore. This client has lots of old data ferreted away and they need to figure out if there's value in it. Our job is to consume the header files to drop and create tables in SQL Server and then populate with actual data. The SQL is trivial -

CREATE TABLE Foo (Col1 varchar(255), ColN varchar(255)); 
BULK INSERT Foo FROM 'C:\sourceFile.csv' WITH (FIRSTROW=1,ROWTERMINATOR='\n',FIELDTERMINATOR='|');

Let's make this harder than it should be

Due to ... curious permissions and corporate politics, the SQL Server service account could only read files via a network share (\\Server\Share\Input\File.csv), never you no mind the fact that path was really just D:\Share\Input. A local drive but permissions were such that we couldn't allow the service account to read from the drive. Opening a network share up and letting the account read from that - no problem.

What are the shares?

That's an easy question to answer, because I knew the answer. net share. I coded up a simple parser and all was well and good until I ran it on the server which had some really, long share names and/or the Resource was long. Like this

Share name   Resource                        Remark

-------------------------------------------------------------------------------
C$           C:\                             Default share
IPC$                                         Remote IPC
ADMIN$       C:\WINDOWS                      Remote Admin
DEV2016      \\?\GLOBALROOT\Device\RsFx0410\\DEV2016
                                             SQL Server FILESTREAM share
RidiculouslyLongShareName
             C:\users\bfellows\Downloads
The command completed successfully.
Super. The output of net share is quasi fixed width and it just wraps whatever it needs to onto the next line/column.

What are the sharesv2

Windows Management Instrumentation to the rescue! WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET Name, Path That's way better, sort of

Name                       Path
ADMIN$                     C:\WINDOWS
C$                         C:\
DEV2016                    \\?\GLOBALROOT\Device\RsFx0410\\DEV2016
IPC$
RidiculouslyLongShareName  C:\users\bfellows\Downloads
Originally, that command ended with GET * which resulted in a lot more information being returned than I needed. The devil though, is that the output width is dependent upon the source data. If I remove the network share for my RidiculouslyLongShareName and rerun the command, I get this output
Name     Path
ADMIN$   C:\WINDOWS
C$       C:\
DEV2016  \\?\GLOBALROOT\Device\RsFx0410\\DEV2016
IPC$
Users    C:\Users
It appears to be longest element +2 spaces for this data but who knows what the real encoding rule is. The good thing is, that while variable, the header rows gives me enough information to slice up the data as needed.

This needs to run anywhere

The next problem is that this process in Dev runs on D:\Share but in QA is is on the I:\datafiles\instance1 and oh by the way, there are two shares for the I drive \\qa\Instance1 (I:\datafiles\instance1) and \\qa\datafiles. (I:\datafiles) In the case where there are multiple shares, if there's one for the folder where the script is running, that's the one we want. Otherwise, it's probably the "nearest" path which I interpreted as having the longest path.

Code good

Here's my beautiful, hacky python. Wherever this script runs, it will then attempt to render the best share path to the same location.

import os
import subprocess

def _generate_share_dictionary(headerRow):
    """Accepts a variable width, white space delimited string that we attempt
        to divine column delimiters from. Returns a dictionary of field names
        and a tuple with start/stop slice positions"""

    # This used to be a more complex problem before I realized I didn't have
    # to do GET * in my source. GET Name, Path greatly simplifies
    # but this code is generic so I keep it as is

    header = headerRow
    fields = header.split()
    tempOrds = {}
    ords = {}
    # Populate the temporary ordinals dictionary with field name and the
    # starting, zero based, ordinal for it.
    # i.e. given
    #Name     Path
    #01234567890123456789
    # we would expect Name:0, Path:9
    for field in fields:
        tempOrds[field] = headerRow.index(field)

    # Knowing our starting ordinal positions, we will build a dictionary of tuples
    # that contain starting and ending positions of our fields
    for iter in range(0, len(fields) -1):
        ords[fields[iter]] = (tempOrds[fields[iter]], tempOrds[fields[iter+1]])
        
    # handle the last element
    ords[fields[-1]] = (tempOrds[fields[-1]], len(headerRow))

    return ords

def get_network_shares():
    """Use WMIC to get the full share list. Needed because "net share" isn't parseable"""
    _command = r"C:\Windows\System32\wbem\WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET Name, Path"
    #_command = r"C:\Windows\System32\wbem\WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET *"
    _results = subprocess.check_output(_command, shell=True).decode('UTF-8')

    _headerRow = _results.splitlines()[0]
    headerOrdinals = _generate_share_dictionary(_headerRow)

    _shares = parse_network_shares_name_path(headerOrdinals, _results)
    return _shares

def parse_network_shares_name_path(header, results):
    """Rip apart the results using our header dictionary"""
    _shares = {}
    #use the above to slice into our results
    #skipping first line since it is header
    for _line in results.splitlines():
        if _line:
            _shares[_line[header["Name"][0]: header["Name"][1]].rstrip()] = _line[header["Path"][0]: header["Path"][1]].rstrip()
    return _shares
    

def translate_local_path_to_share(currentPath):
    """Convert the supplied path to the best match in the shares list"""
    shareName = ""
    defaultShare = ""
    shares = get_network_shares()

    # find the first share match
    if currentPath in shares.values():
        shareName = [key for key, value in shares.items() if value == currentPath][0]
    else:
        #see if we can find a partial match
        # favor longest path
        best = ""
        pathLength = 0
        for share, path in shares.items():
            # path can be empty due to IPC$ share
            if path:
                # Is the share even applicable?
                if path in currentPath:
                    # Favor the non default/admin share (DriveLetter$)
                    if share.endswith('$'):
                        defaultShare = currentPath.replace(path[:-1], share)
                    else:
                        if len(path) > pathLength:
                            shareName = currentPath.replace(path[:-1], share)

        # No other share was found
        if (defaultShare and not shareName):
            shareName = defaultShare
    x = os.path.join(r"\\" + os.environ['COMPUTERNAME'], shareName)
    print("Current folder {} maps to {}".format(currentPath, x))
    
    return os.path.join(r"\\" + os.environ['COMPUTERNAME'], shareName)


def main():
    
    current = os.getcwd()
    #current = "C:\WINDOWS"
    share = translate_local_path_to_share(current)
    print("{} aka {}".format(current, share))

if __name__ == "__main__":
    main()

Takeaways

You probably won't ever need all of the above code to be able to swap out a local path for a network share using python but by golly if you do, have fun. Also, python is still my most favorite language, 14 years running.

No comments: