Merge XML files

Download ApplyMergeXSLT.zip

ApplyMergeXSLT is a Java application that merges a set of UTF-8 XML files sharing the same root element. The application has the following directory structure:

 -- ApplyMergeXSLT
	-- bin
	-- lib
	-- src
	-- X_OUTPUT
	-- X_SOURCE
	-- xslt
	-- run.bat
	-- run_wrapper.bat					
			

The X_SOURCE directory must contain the input XML files to be merged. The files can sit directly in X_SOURCE or anywhere in a directory tree beneath it, since the application scans the directory recursively.
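
The recursive scan of X_SOURCE can be pictured with the following minimal Java sketch. It is not taken from the application's sources: the class name XmlSourceScanner and the method collectXmlFiles are hypothetical, and selecting files by the ".xml" extension (case-insensitive) is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the ApplyMergeXSLT sources.
class XmlSourceScanner {

    /** Returns every regular file under root, at any depth, whose name ends in ".xml". */
    static List<Path> collectXmlFiles(Path root) throws IOException {
        // Files.walk visits the whole directory tree; the stream must be closed.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".xml"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the X_SOURCE directory described above.
        Path root = Paths.get(args.length > 0 ? args[0] : "X_SOURCE");
        collectXmlFiles(root).forEach(System.out::println);
    }
}
```

Sorting the result only makes the order of the input files predictable; whether the application itself orders its inputs is not stated.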

The two .bat files are the Windows entry points to the application (an equivalent .sh script for a Unix system is easy to write). Use “run.bat” when the output should be a single XML file that merges the input files under their common root element. For example:

File1.xml
<Publisher>
    <Journal>
	<JournalInfo>
		<JournalElectronicISSN>2080-2218</JournalElectronicISSN>
		<JournalTitle>Advances in Cell Biology</JournalTitle>
	</JournalInfo>
    </Journal>
</Publisher>

File2.xml
<Publisher>
    <Journal>
	<JournalInfo>
		<JournalPrintISSN>0004-1254</JournalPrintISSN>
		<JournalTitle>Archives of Industrial Hygiene</JournalTitle>
	</JournalInfo>
    </Journal>
</Publisher>

merged.xml
<Publisher>
    <Journal>
	<JournalInfo>
		<JournalElectronicISSN>2080-2218</JournalElectronicISSN>
		<JournalTitle>Advances in Cell Biology</JournalTitle>
	</JournalInfo>
    </Journal>
    <Journal>
	<JournalInfo>
		<JournalPrintISSN>0004-1254</JournalPrintISSN>
		<JournalTitle>Archives of Industrial Hygiene</JournalTitle>
	</JournalInfo>
    </Journal>
</Publisher>
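
The merge illustrated above can be approximated with the following minimal Java sketch. It is not the application's own code: going by its name and its xslt directory, ApplyMergeXSLT presumably relies on XSLT stylesheets, whereas this sketch uses the standard DOM API only to show the resulting structure. The class name RootMergeSketch and the method merge are hypothetical, and attributes on the root element are not copied.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical illustration, not part of the ApplyMergeXSLT sources.
class RootMergeSketch {

    /** Builds one document whose root holds the children of every source root. */
    static Document merge(List<File> sources)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document merged = builder.newDocument();
        Element mergedRoot = null;
        for (File source : sources) {
            Element root = builder.parse(source).getDocumentElement();
            if (mergedRoot == null) {
                // The first file decides the name of the common root element.
                mergedRoot = merged.createElement(root.getNodeName());
                merged.appendChild(mergedRoot);
            } else if (!mergedRoot.getNodeName().equals(root.getNodeName())) {
                throw new IllegalArgumentException(
                        "Root element mismatch in " + source + ": " + root.getNodeName());
            }
            // Copy each child of the source root under the merged root.
            for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
                mergedRoot.appendChild(merged.importNode(child, true));
            }
        }
        return merged; // has no root element if sources is empty
    }
}
```

Calling RootMergeSketch.merge with File1.xml and File2.xml from the example would give a document with a single Publisher root containing both Journal elements.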
			

The other executable file, “run_wrapper.bat”, also keeps the common root element but encloses the content of each source document in a wrapper element (useful, for example, when the root of each XML file has several child elements). As an example:

File1.xml
<Publisher>
	<PublisherInfo>
		<PublisherName>Birkhäuser-Verlag</PublisherName>
	</PublisherInfo>
	<Journal OutputMedium="All">
		<JournalInfo JournalProductType="ArchiveJournal">
			<JournalPrintISSN>0004-069X</JournalPrintISSN>
			<JournalElectronicISSN>1661-4917</JournalElectronicISSN>
			<JournalTitle>Archivum Immunologiae</JournalTitle>
		</JournalInfo>
	</Journal>
</Publisher>

File2.xml
<Publisher>
	<PublisherInfo>
		<PublisherName>Birkhäuser-Verlag</PublisherName>
	</PublisherInfo>
	<Journal OutputMedium="All">
		<JournalInfo JournalProductType="ArchiveJournal">
			<JournalPrintISSN>0004-069X</JournalPrintISSN>
			<JournalElectronicISSN>1661-4917</JournalElectronicISSN>
			<JournalTitle>Archivum Immunologiae</JournalTitle>
		</JournalInfo>
	</Journal>
</Publisher>

merged.xml
<Publisher>
	<wrapper>
		<PublisherInfo>
			<PublisherName>Birkhäuser-Verlag</PublisherName>
		</PublisherInfo>
		<Journal OutputMedium="All">
		    <JournalInfo JournalProductType="ArchiveJournal">
			<JournalPrintISSN>0004-069X</JournalPrintISSN>
			<JournalElectronicISSN>1661-4917</JournalElectronicISSN>
			<JournalTitle>Archivum Immunologiae</JournalTitle>
		    </JournalInfo>
		</Journal>
	</wrapper>
	<wrapper>
		<PublisherInfo>
			<PublisherName>Birkhäuser-Verlag</PublisherName>
		</PublisherInfo>
		<Journal OutputMedium="All">
		    <JournalInfo JournalProductType="ArchiveJournal">
			<JournalPrintISSN>0004-069X</JournalPrintISSN>
			<JournalElectronicISSN>1661-4917</JournalElectronicISSN>
			<JournalTitle>Archivum Immunologiae</JournalTitle>
		    </JournalInfo>
		</Journal>
	</wrapper>
</Publisher>
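
The wrapper variant can be sketched in the same spirit. Again this is only an illustration with the standard DOM API, not the XSLT-based code the application presumably uses. The class name WrapperMergeSketch, the method mergeWrapped and its wrapperName parameter are hypothetical; the element name "wrapper" comes from the example above.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical illustration, not part of the ApplyMergeXSLT sources.
class WrapperMergeSketch {

    /** Puts the content of each source root inside its own wrapper element. */
    static Document mergeWrapped(List<File> sources, String wrapperName)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document merged = builder.newDocument();
        Element mergedRoot = null;
        for (File source : sources) {
            Element root = builder.parse(source).getDocumentElement();
            if (mergedRoot == null) {
                // The first file decides the name of the common root element.
                mergedRoot = merged.createElement(root.getNodeName());
                merged.appendChild(mergedRoot);
            } else if (!mergedRoot.getNodeName().equals(root.getNodeName())) {
                throw new IllegalArgumentException(
                        "Root element mismatch in " + source + ": " + root.getNodeName());
            }
            // One wrapper per source document keeps its children grouped together.
            Element wrapper = merged.createElement(wrapperName);
            for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
                wrapper.appendChild(merged.importNode(child, true));
            }
            mergedRoot.appendChild(wrapper);
        }
        return merged; // has no root element if sources is empty
    }
}
```

Grouping by wrapper makes it possible to tell which PublisherInfo belongs to which Journal after the merge, which a flat list of children under Publisher would not preserve.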
			

Note: the output file, “merged.xml”, is written to the X_OUTPUT directory. It does not retain any DTD declarations from the source files.