Digitization at Panjab Digital Library :: Digital Preservation at PanjabDigiLib
Digital preservation is the most viable, and the only major, technological
alternative available to us for safeguarding our fast-diminishing heritage.
For any preservation effort to bring about positive change, technological
advances have to be matched by an equal measure of concern, enthusiasm, and
the will to take concrete steps in the right direction.
Since its inception, PDL has been at the forefront of the preservation
effort, which aims to save our heritage and to salvage the position of Sikhs
and Panjabis as a cultural identity. It has provided digitization services to
individuals and institutions since 2003 and has so far been instrumental in
digitally archiving thousands of original manuscripts, rare books and other
literature.
PDL provides digitization services both on site and off site, depending on
the constraints. If the material to be digitized warrants special care and
cannot be transported to the PDL offices for any reason, PDL provides its
digitization services at the place where the material is kept. In such
cases, the equipment and staff are temporarily moved close to that place to
carry out the project.
If the material is in good enough condition to be transported and/or the
custodian is willing to send it to the PDL offices, the digitization is
carried out at the PDL center itself. The quality of the digitized output is
the same in either case, but because the complete infrastructure is
available there, projects tend to finish sooner when carried out at the PDL
offices, which also reduces the overall project cost.
The output of a digitization project may vary, but under standard conditions
it includes RAW and JPEG images of the originals. Other derivatives, such as
TIFF images and OCR text, may also be produced from them depending on the
need. The complete digitized material is handed over to the custodian on
DVDs and/or a hard disk, as per the requirements and the official agreement.
Individuals and institutions interested in getting their collections
digitized can contact the PDL representative either by filling in the
requisite form available on the site or by writing to
support@panjabdigilib.org. Once PDL receives a letter of interest, it
initiates the further procedures necessary for undertaking a digitization
project.
Staffing
Human resources are the most essential part of any institutional effort. A
trained staff that can handle a variety of tasks and take informed decisions
in different situations is vital to any organization. PDL has acquired and
developed much of the skilled manpower it needs to sustain the digital
project. The PDL staff includes a dedicated group of programmers who
developed its online digital library and also maintain it. In addition, PDL
employs computer professionals and highly skilled digitization and
post-digitization processing staff who can handle up to 10,000 folios per
day. There is also a group of professors and language and library experts
working part time or full time on metadata creation and research.
PDL also hires short-term local employees to assist with on-site
digitization projects in other cities. These people are adequately trained
to help in various digitization processes, which usually produces a class of
"digitization literate" people who can readily be employed on other digital
projects without much further investment.
Infrastructure
The IT revolution has changed the way we store and access information. The earlier model of monolithic concrete buildings as the storehouses of information is giving way to virtual spaces holding multiple digital repositories. This transition has also affected the infrastructure required of digital libraries.
As the volume of digital information rapidly increases, so does the need for greater storage and retrieval power. This has had a significant impact on software and hardware infrastructure requirements.
PDL continues to expand its infrastructure to match the service quality it wishes to provide to its user base. Most of this infrastructure is based on scalable and interoperable models.
With about 30 installed workstations, PDL has the most data processing power
of any NGO in the region. A good amount of digitization equipment is in use,
including multiple cameras and lighting equipment for varying and specific
digitization needs, and flatbed scanners with capacities ranging from
A4-size paper to 42" wide formats.
PDL has itself developed much of the in-house digitization equipment in use.
Recently, it has created specialized, customizable digitization worktables
and a complete digitization apparatus with an integrated lighting system and
computer systems for wide-screen live view and real-time simultaneous data
transfer. With this equipment PDL intends to increase both the output and
the quality of the digitization work being undertaken, and possibly to reach
more people with its services in a short span of time.
Equipment in Use
Scanners/Cameras
- Wide format scanner that can scan documents up to 42 inches wide
- 10 digital SLR cameras
- 2 scanners, 8.5" x 14"
- 4 scanners, 8.5" x 11.5"
- 2 book scanners with V-shaped cradle
- 6 dark rooms with lighting equipment
Servers
- IBM server with an installed storage of 20 TB, scalable up to 48 TB
Backup Equipment
- LTO4 tape drives
- Hard drives
- DVD writers
Computers
- 25 desktops
- 10 laptops
Among non-governmental organizations in the region, PDL has the highest capacity to produce, edit and back up digital data.
PDL has independently developed much of the support equipment required for digitization. Recently, it has created specially customized digitization worktables and a complete digitization apparatus with an integrated lighting system and computer systems for wide-screen live view and real-time simultaneous data transfer. With the installed equipment, PDL intends to increase both the output and the quality of the digitization work being undertaken, and to digitally preserve more manuscripts in a short span of time.
Digitization process
Though on the surface the digitization process may seem as simple as
clicking a few frames of the object, it involves a number of procedures in
which the data passes through various hands and eyes at multiple levels
before it is deemed fit, final and conforming to all given standards.
Although the process requires meticulous work, it should not be considered
so complex as to be beyond the reach of a mass effort. However, it does
require a certain level of training before a hand can handle a manuscript
and an eye can judge the quality of an image. The following is the general
procedure followed in the digitization of a document.
Document Assessment: Originals are checked for
- Condition
  - Good or bad
  - Fragile or sturdy
  - Bound or unbound
  - Margin around the text; if there is none, whether the text runs into
    the binding gutter
- Size of the manuscript
- Religious sanctity (for proper respect)
- Significance, to see the need for and viability of undertaking
  digitization
  - Age
  - Historicity
  - Illustrated
Accession number and metadata generation:
An accession number is allotted to the original source as soon as it enters
the PDL office or is taken up for digitization. Metadata for the document is
recorded, or generated if not already present.
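As an illustration only, the allotment of accession numbers and the attachment of metadata could be sketched as below. The `ACC-000001` number format, the class names and the metadata fields are assumptions made for this example and are not PDL's actual scheme.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class AccessionRecord:
    """One source document, identified by its accession number."""
    accession_number: str
    title: str
    custodian: str
    metadata: dict = field(default_factory=dict)


class AccessionRegister:
    """Hands out sequential accession numbers (hypothetical format)."""

    def __init__(self, prefix="ACC", start=1):
        self._prefix = prefix
        self._counter = count(start)
        self.records = {}

    def register(self, title, custodian, **metadata):
        # Allot the next number as soon as the document is taken in.
        number = f"{self._prefix}-{next(self._counter):06d}"
        record = AccessionRecord(number, title, custodian, dict(metadata))
        self.records[number] = record
        return record


register = AccessionRegister()
first = register.register("Untitled manuscript", "Private collection",
                          script="Gurmukhi", illustrated=False)
print(first.accession_number)  # ACC-000001
```

Keeping the number separate from the descriptive metadata means the metadata can be filled in or corrected later without changing the identifier used in file names and backups.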
Choosing and setting up digitization equipment:
Digitization equipment is chosen and set up depending on the document type
and digitization needs.
Digitization:
Source material is thoroughly cleaned before digitization.
Scanning: Scanning is done only in cases where the subject is too large or
the resolution requirements are exceptionally high. Photography with a
camera is the standard practice for digitizing the majority of subjects.
Photography:
The light source, the source object and the camera are the three major
elements whose relative positioning, along with other related factors, is
critical to effective, high-quality digitization. Avoiding light and dark
areas on the manuscripts, reflections, and rounding of the manuscripts are a
few of the common issues to be kept in mind. Photographic digitization
involves taking pictures at the highest resolution, keeping in view all the
factors affecting the color, texture and other attributes of the source
documents, the lighting conditions and the equipment used, to ensure the
closest digital representation of the original.
Quality checks:
Standard quality checks are performed on the images after capture. These
include checking for missing images, image completeness, blur, shake, color
tone, orientation, etc. Faulty images are retaken, and the new images are
rechecked to ensure high standards.
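The check for missing images lends itself to a small script. The sketch below assumes that the camera names its files with a consecutive frame counter (for example `IMG_0001.JPG`) and that no frame was deleted on purpose; both the naming pattern and the function name are hypothetical.

```python
import re


def find_missing_frames(filenames):
    """Return frame numbers absent from an otherwise consecutive run."""
    numbers = set()
    for name in filenames:
        # Take the digits immediately before the file extension.
        match = re.search(r"(\d+)\.\w+$", name)
        if match:
            numbers.add(int(match.group(1)))
    if not numbers:
        return []
    return [n for n in range(min(numbers), max(numbers) + 1)
            if n not in numbers]


captured = ["IMG_0001.JPG", "IMG_0002.JPG", "IMG_0004.JPG", "IMG_0005.JPG"]
print(find_missing_frames(captured))  # [3]
```

A gap reported here only flags a frame for a human to look at; blur, shake and color tone still have to be judged by eye or with separate tools.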
Post capturing:
Renaming: All digital files are renamed as per the page sequence of the
original manuscript.
Archiving: Original (master) files are saved (backed up) as they are
captured. Copies are passed on for further processing.
Rotation: Images are rotated if they were captured perpendicular to the
camera frame, which is done to achieve the highest resolution.
Cropping: All images are cropped to remove unwanted areas such as adjacent
folios, other sides and the background.
Skewing: The images are then deskewed to correct any remaining
misorientation. They are rectified to within the acceptable level of a
4-degree tilt.
Resizing: Images are resized to match the average image dimension before
they are put into the presentation.
Watermarking: All images are then watermarked to prevent possible misuse.
The watermark is a stamp placed over every image without tampering with or
obscuring its text.
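The renaming step can be sketched as follows. This is a minimal example that assumes the capture order (sorted by camera file name) matches the page sequence and that files are named by accession number plus a zero-padded page number; the `plan_renames` helper and the naming pattern are assumptions, not PDL's documented convention.

```python
from pathlib import Path


def plan_renames(files, accession, start_page=1):
    """Pair each captured file with a page-sequence name, without renaming."""
    ordered = sorted(files, key=lambda p: p.name)
    # Pad to at least four digits so names sort correctly as text.
    width = max(4, len(str(start_page + len(ordered) - 1)))
    plan = []
    for offset, src in enumerate(ordered):
        page = start_page + offset
        new_name = f"{accession}_{page:0{width}d}{src.suffix.lower()}"
        plan.append((src, src.with_name(new_name)))
    return plan


plan = plan_renames(
    [Path("raw/IMG_0010.JPG"), Path("raw/IMG_0009.JPG")],
    "ACC-000123",
)
for src, dst in plan:
    print(src.name, "->", dst.name)
```

Building the plan first and applying it afterwards (for instance with `src.rename(dst)` on working copies) lets an operator review the mapping before any file is touched, and leaves the archived master files under their original names.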
Presentation:
A presentation of the processed images is created in a specially made
standalone manuscript viewer package (Personal Digital Library). For the
convenience of clients, it does not require separate image viewing software
to view the images.
Backups:
Five backups of the data are created for archiving purposes, in three
different formats stored at three different locations for safety. The backup
media include three copies on DVD, two copies on LTO4 tape and one hard
disk, apart from the offline server at the PDL office.
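When several copies are kept on different media, it is useful to confirm that each copy is still identical to the master. The source does not say how PDL verifies its backups; the sketch below shows one common approach, comparing SHA-256 checksums, with hypothetical file names.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large image files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_copies(master, copies):
    """Return the copies whose contents differ from the master file."""
    expected = sha256_of(master)
    return [copy for copy in copies if sha256_of(copy) != expected]


# Small demonstration with throwaway files standing in for backup media.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    master = root / "master.raw"
    good = root / "copy_dvd.raw"
    bad = root / "copy_tape.raw"
    master.write_bytes(b"folio 1 image data")
    good.write_bytes(b"folio 1 image data")
    bad.write_bytes(b"folio 1 image dat")  # simulated truncated copy
    damaged = find_bad_copies(master, [good, bad])
    print([p.name for p in damaged])  # ['copy_tape.raw']
```

Storing the master checksums alongside the metadata would allow copies at other locations to be checked without the master being present.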
Project Output:
A completely digitized manuscript with relevant metadata, put in a
convenient and secure presentation form (Personal Digital Library), apart
from its archived copies. The original source material is returned to its
custodian along with a digital copy.
Website:
The thumbnail images of the collection generated during the digitization
process are appropriately processed, tagged and uploaded to the online
digital library.