A Parallel File System for Networks of Windows Workstations
José María Pérez
Jesús Carretero
José Daniel García
Félix García
Alejandro Calderón
Grupo de Arquitectura de Computadores, Comunicaciones y Sistemas
(Computer Architecture, Communications and Systems Group)
http://www.arcos.inf.uc3m.es
UNIVERSIDAD CARLOS III DE MADRID
Outline
 Introduction
 Goals
 Design
 Evaluation
 Conclusion
Expanding Windows Kernel to Integrate Heterogeneous Resources on Data Grids
High performance and data storage
 Growing need for high performance data storage.
 Growing capacity of disks.
 Growing data storage from applications.
 I/O becomes the bottleneck.
 Typical solution: Parallel I/O
 Join several storage resources → large storage.
 Increased scalability and performance.
 Load balancing.
Parallel File Systems
 Several nodes with storage devices.
 Accesses performed in parallel.
 Data striped among nodes.
 Striping allows:
 Parallel access to different files.
 Parallel access to the same file.
Striping originally used in RAID.
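As an illustration of striping, the following minimal sketch distributes the stripe units of one file over several nodes in round-robin order. The round-robin policy, the function name and the 64 KB stripe size are assumptions made for the example; the slides do not state the exact allocation policy used by WinPFS.

```python
from typing import List


def stripe_layout(file_size: int, stripe_size: int, num_nodes: int) -> List[List[int]]:
    """Return, for each node, the indices of the stripe units it stores.

    Assumes a plain round-robin placement (an assumption, not the WinPFS policy).
    """
    num_units = -(-file_size // stripe_size)  # ceiling division
    layout: List[List[int]] = [[] for _ in range(num_nodes)]
    for unit in range(num_units):
        layout[unit % num_nodes].append(unit)  # unit k goes to node k mod N
    return layout


# A 640 KB file with 64 KB stripe units on 4 nodes.
print(stripe_layout(file_size=10 * 64 * 1024, stripe_size=64 * 1024, num_nodes=4))
# [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

Consecutive units land on different nodes, so a large sequential access to a single file can be served by several nodes at once.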
Current state
 Current solutions are neither general nor flexible.
 Do not use standard servers.
 Difficult to integrate in existing networks of workstations.
 Require installing new, complex servers.
 Available only for specific platforms.
 Implementation outside the operating system.
 A new I/O API is needed.
 Applications need to be modified or recompiled.
WinPFS: Goal
 Build a parallel file system for networks of Windows workstations using standard data sharing services (such as Windows shared folders).
A first prototype has been built using CIFS/SMB servers.
Detailed goals
 Integrate existing storage resources using shared folders rather
than installing new servers.
 Accomplished by using Windows Redirectors.
 Simple setup.
 Implemented as a new Windows File System in the kernel (a new
stackable driver in the I/O hierarchy).
 Easy to use.
 No special APIs → applications work without recompilation.
 Enhance performance, scalability and capacity.
 Request splitting, balanced data allocation, load balancing, ...
[Figure: WinPFS architecture. Clients using Win32, MPI-IO and other interfaces access WinPFS, which relies on redirectors (NFS, CIFS, HTTP/WebDAV, local, ...) to reach distributed partitions 1 and 2, spread over sites 1 to 3 of an intranet.]
WinPFS design
 Design based on a new Windows kernel component: a file system redirector.
 Implements the basis of the file system.
 Isolates users from the parallel file system.
 Uses protocols to connect to different network file systems.
 Redirector: redirects requests to remote servers using a specific protocol (e.g., CIFS/SMB).
 WinPFS is registered as a virtual remote file system; it implements the parallel I/O mechanisms and uses other remote data services (redirectors).
[Figure: Windows I/O stack. The Win32, POSIX and DOS subsystems sit on the Native NT API and the I/O Manager; WinPFS is placed between the I/O Manager and the Local, CIFS, WebDAV, Netware and NFS file system drivers.]
Remote data access
User point of view
 Access to remote data through shared folders.
 WinPFS creates a new shared folder: \\PFS.
 Users can access parallel files through this shared folder.
Kernel point of view
 Access through CIFS/SMB, …
 Requests are captured through the Universal Naming Convention (UNC).
 Special kind of file system: a redirector on top of other redirector drivers.
File striping and requests
Layered I/O
 The Windows NT family has a layered I/O model.
 Several layers to process a request in the I/O subsystem.
 Each layer is a driver which can receive a request and pass it
to lower layers in the I/O stack.
 The model allows the insertion of new layers, using new
drivers.
 File systems are implemented as drivers in the I/O model, so
new file systems can be added at the kernel level.
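The layered model can be pictured with a toy stack in which each layer receives a request and passes it to the layer below. This is only a conceptual sketch: the class, the layer names and the string request are hypothetical and do not correspond to the Windows driver model API.

```python
from typing import List, Optional


class Driver:
    """One layer of a simplified I/O stack (illustrative only)."""

    def __init__(self, name: str, lower: Optional["Driver"] = None) -> None:
        self.name = name
        self.lower = lower

    def handle(self, request: str, trace: List[str]) -> str:
        trace.append(self.name)  # record that this layer saw the request
        if self.lower is not None:
            return self.lower.handle(request, trace)  # pass it down the stack
        return f"completed {request}"  # the lowest layer completes the request


def build_stack(names: List[str]) -> Driver:
    """Build a stack from a top-to-bottom list of names and return the top driver."""
    top: Optional[Driver] = None
    for name in reversed(names):
        top = Driver(name, lower=top)
    assert top is not None, "at least one layer is required"
    return top


# Inserting a new layer only means adding one more driver to the stack.
stack = build_stack(["new layer", "file system", "disk"])
trace: List[str] = []
result = stack.handle("read", trace)
print(trace, result)
```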
I/O Request Management
 IRP (I/O Request Packet)
 Describes an I/O request.
 Sent to kernel-mode drivers by the I/O Manager (on behalf of the client).
 I/O Manager
 Receives system calls.
 Creates an IRP describing the request.
 Delivers the IRP to the appropriate driver.
 MUP (Multiple UNC Provider)
 Identifies the kernel-mode driver responsible for a network name.
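The role of the MUP can be sketched as a function that maps the server part of a UNC name to the driver that claims it. In this sketch a plain lookup table stands in for whatever mechanism the real MUP uses; the table, the function names and the example paths are hypothetical.

```python
# Hypothetical provider table: UNC server name -> driver that claims it.
PROVIDERS = {"PFS": "WinPFS"}


def unc_server(path: str) -> str:
    """Return the server component of a UNC path."""
    if not path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    return path[2:].split("\\", 1)[0]


def resolve_provider(path: str, default: str = "CIFS redirector") -> str:
    """Pick the driver for a UNC path; unknown names fall back to a default redirector."""
    return PROVIDERS.get(unc_server(path).upper(), default)


print(resolve_provider(r"\\PFS\data\file.dat"))      # WinPFS
print(resolve_provider(r"\\fileserver\docs\a.txt"))  # CIFS redirector
```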
Request management
 Create: IRPs are replicated and sent to each server.
 Read/Write: the request is split into smaller subrequests.
 Create Directory: IRPs are replicated and sent to each server.
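The two behaviours above can be sketched as follows: a read or write on a byte range is cut at stripe-unit boundaries into per-server subrequests, while a create is simply repeated for every server. Round-robin placement, the tuple layout and the function names are assumptions for illustration; real IRPs carry far more state.

```python
from typing import List, Tuple


def split_request(offset: int, length: int, stripe_size: int,
                  num_servers: int) -> List[Tuple[int, int, int]]:
    """Split [offset, offset + length) into (server, local_offset, size) subrequests."""
    subrequests = []
    end = offset + length
    while offset < end:
        unit = offset // stripe_size    # stripe unit that holds this byte
        within = offset % stripe_size   # position inside that unit
        size = min(stripe_size - within, end - offset)  # never cross a unit boundary
        server = unit % num_servers     # assumed round-robin placement
        local_offset = (unit // num_servers) * stripe_size + within
        subrequests.append((server, local_offset, size))
        offset += size
    return subrequests


def replicate_create(name: str, num_servers: int) -> List[Tuple[int, str]]:
    """A create (file or directory) is sent unchanged to every server."""
    return [(server, name) for server in range(num_servers)]


print(split_request(offset=1000, length=3000, stripe_size=1024, num_servers=4))
# [(0, 1000, 24), (1, 0, 1024), (2, 0, 1024), (3, 0, 928)]
print(replicate_create("results.dat", 4))
```

The subrequests of one operation target different servers, which is what allows them to proceed in parallel.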
Using WinPFS
 Administration / Installation:
 Install a new driver in client nodes.
 Share folders in server nodes.
 List the shared folders in the registry of the client nodes.
 User
 Prefix paths with \\PFS.
We plan to map remote names to common drive letters.
 WinPFS may be used with any API that is on top of Windows
Services.
 Win32, POSIX, DOS, cygwin, …
Other Features
 Caching.
 Caching mechanisms performed by redirectors.
 Limited to Windows caching model.
 More advanced caching for future work.
 Security and Authentication
 Current model works on a Windows domain, forests and trusted domains. Standard Windows mechanisms are used to manage policies and security in enterprises, labs and departments.
 Uses the standard Windows security model.
 Changes are needed for workgroups or untrusted domains.
 Data consistency between clients.
 Currently solved only when all servers use CIFS, through the default mechanism of the CIFS redirector: oplocks (opportunistic locks).
Evaluation
 Creating a file of 100 MB.
 Write sequentially.
 Read sequentially.
 Static buffer size.
 Client cache disabled.
 Two clusters with four nodes each.
 Node
 Dual-processor Pentium III.
 1 GHz.
 1 GB main memory.
 200 GB disk.
 Gigabit Ethernet network
 1 Windows 2003 Server
 7 Windows XP Professional
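A test of this kind can be approximated with the sketch below, which writes and then reads a file sequentially with a fixed buffer size and reports throughput in Mbit/s. It is not the benchmark used for the slides: the function name is hypothetical, and the code only disables Python-level buffering, so operating system and server caches still apply unless they are turned off separately.

```python
import os
import tempfile
import time
from typing import Tuple


def sequential_throughput(path: str, total_bytes: int, buffer_size: int) -> Tuple[float, float]:
    """Write then read total_bytes sequentially; return (write, read) throughput in Mbit/s."""
    buf = b"\0" * buffer_size

    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        written = 0
        while written < total_bytes:
            written += f.write(buf[: total_bytes - written])
        os.fsync(f.fileno())  # make sure the data has left the process
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while f.read(buffer_size):
            pass
    read_seconds = time.perf_counter() - start

    megabits = total_bytes * 8 / 1e6
    return megabits / write_seconds, megabits / read_seconds


# Small local run; the slides use a 100 MB file and buffer sizes from 1 KB to 1 MB.
with tempfile.TemporaryDirectory() as tmp:
    write_rate, read_rate = sequential_throughput(
        os.path.join(tmp, "bench.dat"), total_bytes=1024 * 1024, buffer_size=64 * 1024
    )
    print(f"write {write_rate:.1f} Mbit/s, read {read_rate:.1f} Mbit/s")
```

To exercise a parallel file system, the path would point into the \\PFS shared folder instead of a local temporary directory.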
Evaluation infrastructure
[Figure: evaluation infrastructure. Eight PCs connected through two Gigabit Ethernet switches.]
Configurations
 CIFS: One server.
 PFS88: 8 servers in parallel.
 PFS44: 4 servers in parallel.
 PFS84: 4 servers in parallel, selected randomly from a set of 8.
In all cases, 8 clients were running (1 client per node).
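The PFS84 configuration amounts to drawing 4 distinct servers from a pool of 8. A minimal sketch follows; the server names and the function are hypothetical, and the slides do not say whether the draw is made per file or per client.

```python
import random
from typing import List, Optional


def pick_servers(pool: List[str], count: int, seed: Optional[int] = None) -> List[str]:
    """Choose `count` distinct servers at random from `pool`."""
    return random.Random(seed).sample(pool, count)


pool = [f"server{i}" for i in range(1, 9)]  # hypothetical names for the 8 nodes
print(pick_servers(pool, 4, seed=0))
```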
Write results
[Chart: write throughput (Mbit/s, axis from 0 to 300) versus buffer size (1 KB to 1 MB) for CIFS, PFS88, PFS44 and PFS84.]
Read results
[Chart: read throughput (Mbit/s, axis from 0 to 1400) versus buffer size (1 KB to 1 MB) for CIFS, PFS88, PFS44 and PFS84.]
Results
 All WinPFS solutions provide better results than CIFS.
 PFS88 provides the best performance, as its degree of parallelism is the highest.
 Performance reaches 250 Mbit/s for write operations. Writes are limited by the disks.
 Performance reaches 1200 Mbit/s for read operations. Reads are limited by the network.
Write Speedup PFS88/CIFS
Read Speedup PFS88/CIFS
Speedup results
 Speedup is higher with more concurrent clients.
 Write speedups from 500% to 700% may be achieved.
 Read speedup is less than 100% because data are obtained from server caches without disk accesses.
 WinPFS performance is limited by the size of the striping buffer.
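The slides do not define how speedup is computed. One reading that is consistent with every WinPFS configuration outperforming CIFS while the read speedup stays below 100% is the relative throughput gain over CIFS; this is an assumption, and the symbols below (T for throughput, b for buffer size) are introduced only for illustration.

```latex
% Assumed definition, not stated in the slides:
% relative throughput gain of PFS88 over CIFS at buffer size b.
\[
  \mathrm{speedup}(b) \;=\;
  \frac{T_{\mathrm{PFS88}}(b) - T_{\mathrm{CIFS}}(b)}{T_{\mathrm{CIFS}}(b)} \times 100\%
\]
```

Under this reading, a write speedup of 500% would mean that PFS88 delivers six times the CIFS write throughput at that buffer size.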
Conclusions
 WinPFS is a parallel file system implemented as a kernel-mode driver.
 Integration into the kernel provides higher performance.
 Uses existing mechanisms at kernel level.
 No change or recompilation needed in client applications.
 We can run an application that uses parallel I/O, taking advantage of the shared folders in our organizations, without affecting users.
 For example, launching an I/O-intensive application in a classroom that accesses the shared folders.
Future Work
 Use of the Active Directory service to create a metadata repository and give a consistent image of parallel file systems.
 Objective: no need to edit the client registry manually to provide information about shared folders.
 Evaluation with other operating systems.
 Linux, FreeBSD and Solaris sharing folders with Samba.
 Evaluation with other protocols (redirectors).
 NFS (redirector provided by Services for UNIX 3.5) and WebDAV.
 This way, WinPFS can connect to more servers, including NAS.
 Parallel usage of heterogeneous resources and protocols in networks of workstations.
 Dynamic addition and removal of storage nodes.
 Data allocation and load balancing for heterogeneous distributed systems.