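The present invention belongs to the field of microbial genetic resources and genetic engineering. It specifically relates to the cloning, sequence analysis, in vivo functional verification, in vitro biochemical study, and application of the biosynthetic gene cluster of the nucleoside antibiotics 2'-chloropentostatin (2'-Cl PTN) and 2'-amino-2'-deoxyadenosine (2'-amino dA).
Background Art: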
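In 1979, American scientists first isolated the purine nucleoside antibiotic 2'-amino-2'-deoxyadenosine from the fermentation broth of Actinomadura sp. ATCC 39365 (J Antibiot (Tokyo), 1979, 32, 1367-1369; Arch Biochem Biophys, 1989, 270, 374-382). 2'-Amino-2'-deoxyadenosine has broad activity against RNA viruses and has been applied in the treatment of RNA viruses such as mycoplasma virus and measles (J Antibiot (Tokyo), 1979, 32, 1367-1369; Agr Biol Chem Tokyo, 1985, 49, 2711-2717). 2'-Amino-2'-deoxyadenosine is readily converted to 2'-amino-2'-deoxyinosine (2'-amino dI) by adenosine deaminase (ADA), which is ubiquitous in cells. Earlier researchers also isolated a potent ADA inhibitor, 2'-chloropentostatin (2'-Cl PTN), from the fermentation broth of the same strain. Also a purine nucleoside antibiotic, 2'-chloropentostatin is used clinically to treat hematological cancers (J Antibiot (Tokyo), 1985, 38, 1344-1349).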
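In terms of chemical structure, 2'-amino-2'-deoxyadenosine is an analog of adenosine and differs from it in that the hydroxyl group at C-2' is replaced by an amino group through transamination. 2'-Chloropentostatin contains an unusual 1,3-diazepine (seven-membered) ring and is chlorinated at C-2'. Earlier isotope feeding experiments showed that the C-2' amino group in the biosynthesis of 2'-amino-2'-deoxyadenosine arises by transamination of the C-2' hydroxyl group (Arch Biochem Biophys, 1989, 270, 374-382). Isotope labeling experiments likewise established that 2'-chloropentostatin is biosynthesized from adenosine as the precursor, and that the 1,3-diazepine ring is formed by insertion of C-1 of D-ribose between C-6 and N-1 of the purine ring (Biochemistry, 1984, 23, 904-907; Biochemistry, 1987, 26, 5636-5641). In addition, the initial stage of 2'-chloropentostatin biosynthesis shows some similarity to the biosynthetic pathways of L-histidine and of pentostatin (Arch Biochem Biophys, 1989, 270, 374-382; Cell Chem Biol, 2017, 24, DOI:10.1016/j.chembiol.2016.12.012). Although 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine are widely used clinically and their chemical synthesis is well studied, little has been learned about the biosynthesis of these two purine nucleoside antibiotics over the past decades.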
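Summary of the Invention:
To overcome the deficiencies of the prior art, the present invention takes as target molecules two natural purine nucleoside antibiotics produced by Actinomadura sp. ATCC 39365: 2'-chloropentostatin, which is used to treat leukemia, and 2'-amino-2'-deoxyadenosine, which is active against RNA viruses. Their biosynthetic gene cluster was cloned, and sequence analysis, functional verification, and in vitro biochemical experiments revealed that a single gene cluster contains two independent biosynthetic pathways. The complete gene cluster of the invention comprises 13 genes, and its nucleotide sequence is shown at positions 10898-25249 of SEQ ID NO:1. Five genes, adaC, adaB, adaA, adaK, and adaL, are responsible for the biosynthesis of 2'-chloropentostatin; four genes, adaF, adaG, adaJ, and adaM, are responsible for the biosynthesis of 2'-amino-2'-deoxyadenosine; and four genes, adaE, adaD, adaH, and adaI, are responsible for the transport and regulation of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine.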
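The invention further provides a nucleotide sequence encoding a cation transporter; the encoded amino acid sequence is SEQ ID NO.2, the gene is designated adaE, and its nucleotide sequence is located at bases 10898-12337 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding an MFS transporter; the encoded amino acid sequence is SEQ ID NO.3, the gene is designated adaD, and its nucleotide sequence is located at bases 12392-13600 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding an ATP phosphoribosyltransferase; the encoded amino acid sequence is SEQ ID NO.4, the gene is designated adaC, and its nucleotide sequence is located at bases 13672-14559 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding a short-chain dehydrogenase; the encoded amino acid sequence is SEQ ID NO.5, the gene is designated adaB, and its nucleotide sequence is located at bases 14604-15308 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding a SAICAR synthetase; the encoded amino acid sequence is SEQ ID NO.6, the gene is designated adaA, and its nucleotide sequence is located at bases 15295-16014 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding an aminotransferase; the encoded amino acid sequence is SEQ ID NO.7, the gene is designated adaF, and its nucleotide sequence is located at bases 16239-17516 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding a dehydrogenase; the encoded amino acid sequence is SEQ ID NO.8, the gene is designated adaG, and its nucleotide sequence is located at bases 17509-18564 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding an ABC transporter; the encoded amino acid sequence is SEQ ID NO.9, the gene is designated adaH, and its nucleotide sequence is located at bases 18593-20380 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding an ABC transporter; the encoded amino acid sequence is SEQ ID NO.10, the gene is designated adaI, and its nucleotide sequence is located at bases 20402-22180 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding a NUDIX hydrolase; the encoded amino acid sequence is SEQ ID NO.11, the gene is designated adaJ, and its nucleotide sequence is located at bases 22177-22662 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding a phosphoribosyl isomerase A; the encoded amino acid sequence is SEQ ID NO.12, the gene is designated adaK, and its nucleotide sequence is located at bases 22659-23396 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding an ATP phosphoribosyltransferase; the encoded amino acid sequence is SEQ ID NO.13, the gene is designated adaL, and its nucleotide sequence is located at bases 23580-24446 of SEQ ID NO:1.
The invention further provides a nucleotide sequence encoding a phosphohydrolase; the encoded amino acid sequence is SEQ ID NO.14, the gene is designated adaM, and its nucleotide sequence is located at bases 24455-25249 of SEQ ID NO:1.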
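The coordinates listed above can be checked for internal consistency with a few lines of code. The following is a minimal sketch, not part of the disclosed method: the names `GENES`, `summarize`, `overlaps`, and `cluster_span` are illustrative, and the only data used are the base positions stated above, treated as 1-based inclusive intervals on SEQ ID NO:1.

```python
# Gene intervals (1-based, inclusive) on SEQ ID NO:1, copied from the text.
GENES = [
    ("adaE", 10898, 12337),
    ("adaD", 12392, 13600),
    ("adaC", 13672, 14559),
    ("adaB", 14604, 15308),
    ("adaA", 15295, 16014),
    ("adaF", 16239, 17516),
    ("adaG", 17509, 18564),
    ("adaH", 18593, 20380),
    ("adaI", 20402, 22180),
    ("adaJ", 22177, 22662),
    ("adaK", 22659, 23396),
    ("adaL", 23580, 24446),
    ("adaM", 24455, 25249),
]


def summarize(genes):
    """Return the length of each interval and whether it is a multiple of 3."""
    rows = []
    for name, start, end in genes:
        length = end - start + 1  # inclusive coordinates
        rows.append({"gene": name, "length_bp": length, "in_frame": length % 3 == 0})
    return rows


def overlaps(genes):
    """List neighbouring intervals that share bases, with the overlap size in bp."""
    found = []
    for (n1, _s1, e1), (n2, s2, _e2) in zip(genes, genes[1:]):
        if s2 <= e1:
            found.append((n1, n2, e1 - s2 + 1))
    return found


def cluster_span(genes):
    """Total span from the first start to the last end, in bp."""
    return max(e for _, _, e in genes) - min(s for _, s, _ in genes) + 1


if __name__ == "__main__":
    for row in summarize(GENES):
        print(row)
    print("overlapping neighbours:", overlaps(GENES))
    print("cluster span (bp):", cluster_span(GENES))
```

Under these assumptions every interval has a length divisible by three, the span is 14352 bp (the 14.3 kb cluster), and four neighbouring pairs share a few bases, which is common for tightly packed bacterial operons.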
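Starting from the cloned biosynthetic gene cluster, the biosynthesis was studied by combining microbiology, molecular biology, biochemistry, and organic chemistry. Study of the biosynthetic mechanism revealed that one gene cluster contains two independent biosynthetic pathways. This unusual arrangement embodies a mechanism in which one product protects the other, and it also reveals the enzymatic mechanisms that form unusual chemical structures, including that of 2'-chloropentostatin.
On this basis, the principles of metabolic engineering can be applied, with rational modification of the biosynthetic pathways through combinatorial biology, to explore new drugs that are structurally stable, more active, and producible by large-scale microbial fermentation.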
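Applications of the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster of the invention include, but are not limited to:
(1) The invention provides a route to microorganisms in which the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster is disrupted, wherein at least one of the genes comprises a nucleotide sequence from SEQ ID NO.1.
(2) Cloned DNA comprising the nucleotide sequence provided by the invention, or at least part of it, can be used to locate further library plasmids in the genomic library of Actinomadura sp. ATCC 39365. These library plasmids contain at least part of the sequence of the invention.
(3) The nucleotide sequence provided by the invention, or at least part of it, can be modified or mutated. Approaches include insertion, substitution, or deletion; polymerase chain reaction; error-prone polymerase chain reaction; site-specific mutagenesis; religation of different sequences; directed evolution using different parts of the sequence or homologous sequences from other sources; and mutagenesis by ultraviolet light or chemical reagents.
(4) Cloned genes comprising the nucleotide sequence provided by the invention, or at least part of it, can be expressed in heterologous hosts through a suitable expression system to obtain the corresponding enzymes, or other products with higher biological activity or yield. Such heterologous hosts include Streptomyces, Pseudomonas, Escherichia coli, Bacillus, yeast, plants, and animals.
(5) The amino acid sequences provided by the invention can be used to isolate the desired proteins and to prepare antibodies.
(6) Polypeptides comprising the amino acid sequences provided by the invention, or at least part of them, may retain biological activity or even gain new biological activity after certain amino acids are removed or replaced, or may show improved yield, optimized protein kinetics, or other desired properties.
(7) Genes or gene clusters comprising the nucleotide sequence provided by the invention, or at least part of it, can be expressed in heterologous hosts, and their function in the host metabolic network can be studied by DNA chip technology.
(8) Genes or gene clusters comprising the nucleotide sequence provided by the invention, or at least part of it, can be used to construct recombinant plasmids by genetic recombination to obtain new biosynthetic pathways. New biosynthetic pathways can also be obtained through insertion, substitution, deletion, or inactivation.
(9) A protein encoded by a nucleotide sequence provided by the invention can catalyze the conversion of ATP to AMP, and can be recombined with the biosynthetic pathway, or part of the pathway, of other natural products to obtain new purine nucleoside compounds.
(10) A protein encoded by a nucleotide sequence provided by the invention can catalyze the conversion of adenosine and its structural analogs to inosine and its structural analogs, and can be recombined with the biosynthetic pathway, or part of the pathway, of other natural products to obtain new purine nucleoside compounds.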
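Accordingly, recombinant vectors, expression cassettes, transgenic cell lines, and recombinant strains containing the above gene cluster also fall within the scope of protection of the invention.
The use of the above proteins, gene cluster, recombinant vectors, expression cassettes, transgenic cell lines, or recombinant strains in the synthesis of 2'-chloropentostatin and/or 2'-amino-2'-deoxyadenosine also falls within the scope of protection of the invention.
Another object of the invention is to provide a method for synthesizing 2'-chloropentostatin or 2'-amino-2'-deoxyadenosine.
The method provided by the invention comprises fermenting the above recombinant strain and collecting the fermentation product to obtain 2'-chloropentostatin and/or 2'-amino-2'-deoxyadenosine.
In summary, the complete gene and protein information related to the biosynthesis of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine provided by the invention helps in understanding the biosynthetic mechanism of purine nucleoside antibiotics and supplies material and knowledge for further genetic engineering. The genes and proteins provided by the invention can also be used to search for and discover compounds, genes, or proteins useful in medicine, industry, or agriculture.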
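Brief Description of the Drawings
Figure 1: Chemical structures of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine.
Figure 2: A) Genetic organization of the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster. B) Schematic of the disruption of the biosynthetic gene cluster and high-resolution liquid chromatography-mass spectrometry (LC-MS) analysis of the fermentation products, where WT is the wild type, ST is the standard, and LG1 is the mutant strain. C) MS1 (primary mass spectrum) data for 2'-chloropentostatin.
Figure 3: Schematic of the disruptions within the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster and high-resolution LC-MS analysis of the fermentation products.
Figure 4: Proposed biosynthetic pathways of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine.
A) 2'-Chloropentostatin. B) 2'-Amino-2'-deoxyadenosine.
Figure 5: Functional analysis of the pyrophosphohydrolase.
A) SDS-PAGE analysis of the AdaJ protein. B) Effect of different divalent metal ions on AdaJ activity with ATP as the substrate. C) HPLC analysis of the AdaJ reaction with ATP and with AMP as the substrate. D) Activity of AdaJ toward different substrates with cobalt as the divalent metal ion.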
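Detailed Description of the Embodiments
The features and advantages of the invention can be further understood from the following detailed description in conjunction with the drawings. The examples provided merely illustrate the method of the invention and do not in any way limit the remainder of the disclosure.
Unless otherwise specified, the experimental methods used in the following examples are conventional methods.
Unless otherwise specified, the materials and reagents used in the following examples are commercially available.
1. Cloning and analysis of the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster: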
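We first extracted total DNA from Actinomadura sp. ATCC 39365, performed whole-genome survey sequencing on it, and constructed a total genomic library of Actinomadura sp. ATCC 39365 in the vector pJTU2463b. According to the reports of Hanvey et al. in 1984 and 1987, adenosine is the direct biosynthetic precursor of pentostatin, and the unusual 1,3-diazepine ring is formed by insertion of C-1 of D-ribose between C-6 and N-1 of the purine ring. The pentostatin biosynthetic pathway was further studied by Chen et al. in 2017. Using the pentostatin biosynthetic genes penA, penB, and penC from Streptomyces antibioticus NRRL 3238 as probes, we compared them with the sequencing results for the total DNA of Actinomadura sp. ATCC 39365 and found homologous encoded proteins. The genomic library was then screened with the primer pair 39365cluster-idF: ACACCACCTCGATGCTCG and 39365cluster-idR: ACCAGGTTCTCGGTCAGG, and cosmids containing adaC, adaB, and adaA, the homologs of penA, penB, and penC, were isolated. DNA sequencing and sequence analysis identified a cosmid covering a 34.2 kb chromosomal DNA region, which established the approximate genomic location of the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster. The cosmid containing the cluster was analyzed further by sequencing, and the target gene cluster fragment was cloned into the vector pJTU2463. A mutant strain in which the target gene cluster fragment was disrupted produced neither 2'-chloropentostatin nor 2'-amino-2'-deoxyadenosine. DNA sequencing covered a 17 kb chromosomal region, and bioinformatic analysis identified 15 open reading frames in it. The detailed results are listed in Table 1.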
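Table 1: Functional analysis of the genes and encoded proteins in the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster
a NCBI accession numbers are given in parentheses.
b Open reading frames outside the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster.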
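The two screening primers named in section 1 (39365cluster-idF and 39365cluster-idR) can be inspected with a short script. This is a sketch under stated assumptions rather than the procedure used in the invention: the function names are hypothetical, the melting temperature uses the rough Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a first estimate for short oligonucleotides, and `LISTING_LINE` is a single line copied from the sequence listing of SEQ ID NO:1 (the line ending at position 10860).

```python
# Translation table for complementing DNA bases in either case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def parse_listing_line(line):
    """Split a listing line into (bases, position of the last base)."""
    *groups, end = line.split()
    return "".join(groups), int(end)


def find_in_line(primer, line):
    """Return the 1-based position where the primer starts in this line, or None."""
    bases, end = parse_listing_line(line)
    first_position = end - len(bases) + 1
    idx = bases.upper().find(primer.upper())
    return None if idx < 0 else first_position + idx


FORWARD = "ACACCACCTCGATGCTCG"   # 39365cluster-idF, from the text
REVERSE = "ACCAGGTTCTCGGTCAGG"   # 39365cluster-idR, from the text
LISTING_LINE = ("ctccgacacc acctcgatgc tcgacgagat cgtgaacaac "
                "cggagcgagg tctactgacc 10860")

if __name__ == "__main__":
    print("forward Tm estimate:", wallace_tm(FORWARD))
    print("reverse Tm estimate:", wallace_tm(REVERSE))
    print("reverse primer, top-strand form:", reverse_complement(REVERSE))
    print("forward primer starts at:", find_in_line(FORWARD, LISTING_LINE))
```

To locate the reverse primer, the same search would be run with `reverse_complement(REVERSE)` against the full sequence, since a reverse primer anneals to the top strand as its reverse complement.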
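2. Determination of the boundaries of the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster:
Disruption of the gene orf1 did not affect production of 2'-chloropentostatin or 2'-amino-2'-deoxyadenosine. Based on this result and on functional analysis of the encoded proteins, the biosynthetic gene cluster was defined as extending from adaE to adaM, covering a 14.3 kb chromosomal region that contains 13 open reading frames. The cluster comprises 13 genes in total: 5 genes related to the synthesis of 2'-chloropentostatin, 4 genes related to the synthesis of 2'-amino-2'-deoxyadenosine, and 4 genes related to the transport and regulation of both compounds, as shown in Figure 2A. The gene cluster was disrupted at the chromosomal level to give mutant strain LG1. High-resolution liquid chromatography-mass spectrometry (HRLC-MS) analysis of its fermentation broth confirmed that the mutant produces neither 2'-chloropentostatin nor 2'-amino-2'-deoxyadenosine, as shown in Figure 2.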
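3. In vivo functional determination of genes related to the biosynthesis of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine:
Each gene in the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine gene cluster was knocked out in turn, and each resulting construct was transferred by conjugation into the host strain CXR14 for heterologous expression. CXR14 is a Streptomyces strain derived from an industrial polyoxin-producing strain by a large-fragment deletion, and it does not itself produce 2'-chloropentostatin or 2'-amino-2'-deoxyadenosine. The fermentation broths of the mutant strains were analyzed by HRLC-MS. This established that five genes, adaC, adaB, adaA, adaK, and adaL, are related to 2'-chloropentostatin biosynthesis; four genes, adaF, adaG, adaJ, and adaM, are related to 2'-amino-2'-deoxyadenosine biosynthesis; and four genes, adaE, adaD, adaH, and adaI, act as transport and regulatory genes, as shown in Figure 3.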
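4. Biosynthesis of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine
In vivo, some genes of the cluster act in 2'-chloropentostatin biosynthesis in a manner related to the biosynthesis of L-histidine and of pentostatin. Starting from dATP and PRPP, the ATP phosphoribosyltransferases AdaC and AdaL convert dATP to phosphoribosyl-dATP (PR-dATP, compound 1). Compound 2 is then formed by three proteins: the phosphoribosyl-AMP cyclohydrolase HisI and the phosphoribosyl-ATP pyrophosphatase HisE of the L-histidine biosynthetic pathway, together with the phosphoribosyl isomerase AdaK. A series of reactions catalyzed by the SAICAR synthetase then yields compound 5, which the dephosphorylating enzyme AdaM converts to 6-hydroxypentostatin (compound 6). Finally, the short-chain dehydrogenase AdaB converts 6-hydroxypentostatin to pentostatin (compound 7), which is chlorinated at the 2' position under the action of the cation transporter AdaE to give 2'-chloropentostatin (compound 8), as shown in Figure 4A. The biosynthesis of 2'-amino-2'-deoxyadenosine starts from adenosine triphosphate (ATP). The NUDIX hydrolase AdaJ and the dehydrogenase AdaG jointly catalyze and regulate the formation of adenosine, and the aminotransferase AdaF then converts the 2'-hydroxyl group of adenosine to an amino group by transamination to give 2'-amino-2'-deoxyadenosine, as shown in Figure 4B.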
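5. In vitro functional verification of the NUDIX hydrolase AdaJ, encoded by the 2'-amino-2'-deoxyadenosine biosynthetic gene adaJ:
The adaJ gene was amplified by PCR, cloned into an expression vector, and overexpressed heterologously in Escherichia coli. After purification, SDS-PAGE analysis showed that relatively pure AdaJ protein had been obtained. Bioinformatic analysis indicated that AdaJ is a pyrophosphohydrolase. With adenosine triphosphate (ATP) as the substrate, mass spectrometric analysis of the in vitro AdaJ reaction showed that adenosine monophosphate (AMP) was formed. AdaJ therefore catalyzes the conversion of ATP to AMP, and this reaction, as the first step of the 2'-amino-2'-deoxyadenosine biosynthetic pathway, initiates the biosynthesis of 2'-amino-2'-deoxyadenosine. Tests with different divalent metal ions and different substrates then showed that AdaJ activity is highest when divalent cobalt ions are added to the reaction. In addition to ATP, AdaJ can use dATP and GTP as substrates, but it cannot use ADP, as shown in Figure 5.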
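[Example 1] Extraction of total DNA from Actinomadura sp. ATCC 39365, the producer of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine.
Inoculate 30 μL of Actinomadura sp. ATCC 39365 spores into 50 mL of TSB medium and culture at 30 °C and 200 rpm for 24-36 h until the medium becomes turbid. Centrifuge the 50 mL culture at 4,000 rpm and 4 °C for 10 min, discard the supernatant, and collect the cells. Resuspend the cells in 25 mL of 10.3% sucrose solution, shake to wash them, centrifuge at 4,000 rpm and 4 °C for 10 min, and discard the supernatant. Resuspend the cells in 15 mL of SET buffer with shaking, centrifuge at 4,000 rpm and 4 °C for 10 min, and discard the supernatant; repeat twice. Resuspend the cells in 10 mL of SET buffer with shaking, add 50 μL of lysozyme solution (100 mg/mL), and incubate in a 37 °C water bath for 30 min. Add 280 μL of proteinase K solution (50 mg/mL) and mix, then add 600 μL of 10% SDS, mix by inversion, and incubate at 55 °C for 4 h. During this period, mix by inversion every 15 min and add 100 μL of proteinase K solution every 30 min until the mycelia are lysed and the mixture becomes clear. Add 4 mL of 5 M NaCl, mix by inversion, and leave at room temperature until the lysate cools to about 37 °C. Add 10 mL of chloroform, mix by inversion until the mixture turns milky white, and centrifuge at 4,000 rpm and 4 °C for 10 min. Remove the supernatant and mix it with 0.6 volume of isopropanol. Flocculent DNA precipitates after mixing; carefully pick out the precipitated DNA, wash it twice with 75% ethanol, air-dry it in a ventilated place, and dissolve it in an appropriate amount of ultrapure water. Check the size of the DNA by nucleic acid electrophoresis (40 V, 12 h), and use a Nanodrop 2000 to measure its concentration and OD values to assess the purity of the extracted total DNA.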
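[Example 2] Construction of a genomic library of Actinomadura sp., the producer of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine.
The amount of Sau3AI was first determined by a series of dilution experiments. A digestion mixture with a total volume of 500 μL (5 μL of 0.1 U/μL Sau3AI and 495 μL of DNA diluted in 10× buffer) was prepared as solution 1. Then 300 μL of solution 1 was mixed with 100 μL of DNA to give solution 2 (final Sau3AI concentration 0.075 U/100 μL); 200 μL of solution 1 was mixed with 200 μL of DNA to give solution 3 (0.05 U/100 μL); 200 μL of solution 2 was mixed with 200 μL of DNA to give solution 4 (0.0375 U/100 μL); and 200 μL of solution 4 was mixed with 200 μL of DNA to give solution 5 (0.01875 U/100 μL). All solutions were kept on ice after mixing, incubated together in a 37 °C water bath for 1 h, and returned to ice immediately. Digestion quality was checked on a 12 cm long 1% agarose gel, with control DNA and λ mix as markers, a 5 μL loading volume, and 30 V for 18 h, followed by inspection on a gel imager. Based on the results of this pilot digestion, a suitable dilution was chosen for pulsed-field gel electrophoresis (conditions: pump temperature 16 °C, run time 16 h, voltage 6.0 V, angle 120°, switch times 1 s and 6 s). The sample was loaded onto a pre-cast 1% low-melting-point agarose gel. After the run, the gel was examined under long-wave UV light and the gel region containing fragments of about 48 kb was recovered. The gel slice was placed in 10× β-agarase I reaction buffer and incubated at 65 °C until completely melted. After cooling to 42 °C, β-agarase was added (1 μL of enzyme per 100 μL), followed by incubation at 42 °C for 1 h and then at 65 °C for 15 min to inactivate the enzyme. The mixture was centrifuged at 12,000 rpm for 15 min and the supernatant was collected. One-tenth volume of freshly prepared 3 M sodium acetate and 2 volumes of isopropanol were added and mixed thoroughly. After 10 min at room temperature, the mixture was centrifuged at 12,000 rpm for 15 min and the supernatant was discarded. The pellet was washed twice with 75% ethanol, dried at room temperature, and dissolved in an appropriate amount of water. The quality of the recovered fragments was checked by electrophoresis; the fragments were then dephosphorylated and cloned into the vector pJTU2463b.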
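The Sau3AI concentrations quoted for solutions 1 to 5 follow from simple mixing arithmetic, which can be reproduced as a check. The sketch below is illustrative only; the helper `mix` and the dictionary layout are not from the source, and the DNA added at each step is assumed to contain no enzyme.

```python
def mix(conc_a, vol_a, conc_b, vol_b):
    """Concentration after combining two solutions (volumes in the same unit)."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)


def sau3ai_series():
    """Return Sau3AI concentrations in U per 100 uL for solutions 1 to 5."""
    # Solution 1: 5 uL of 0.1 U/uL enzyme in a 500 uL total volume.
    s1 = (5 * 0.1) / 500 * 100
    # Each later solution is made by adding enzyme-free DNA (0 U).
    s2 = mix(s1, 300, 0.0, 100)  # 300 uL solution 1 + 100 uL DNA
    s3 = mix(s1, 200, 0.0, 200)  # 200 uL solution 1 + 200 uL DNA
    s4 = mix(s2, 200, 0.0, 200)  # 200 uL solution 2 + 200 uL DNA
    s5 = mix(s4, 200, 0.0, 200)  # 200 uL solution 4 + 200 uL DNA
    return {1: s1, 2: s2, 3: s3, 4: s4, 5: s5}


if __name__ == "__main__":
    for number, conc in sau3ai_series().items():
        print(f"solution {number}: {conc:.5f} U/100 uL")
```

The computed values agree with the concentrations stated in the text for solutions 2 to 5 and give 0.1 U/100 μL for solution 1.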
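Preparation of vector pJTU2463b: the plasmid was extracted and checked by gel electrophoresis, digested with HpaI alone, dephosphorylated (to prevent self-ligation), and then digested with BamHI to give DNA fragments of 7 kb and 2 kb. The 7 kb fragment was recovered from the gel.
Ligation of total DNA fragments with vector pJTU2463b: the prepared vector (15 ng/μL) and the recovered dephosphorylated total DNA fragments (46.2 ng/μL) were ligated at a ratio of 1:3. Ligation mixture: 4 μL vector, 13 μL DNA fragments, 2 μL T4 buffer, 1 μL T4 ligase (NEB), made up to 20 μL with ultrapure water. The mixture was incubated at 16 °C for 12 h and then at 70 °C for 10 min to inactivate the enzyme. The ligation product was checked by gel electrophoresis before the subsequent steps.
Library packaging: 25 μL of packaging extract taken from a -80 °C freezer was mixed with 10 μL of ligation product and incubated at 30 °C for 90 min. A further 25 μL of packaging extract was added and incubation continued at 30 °C for 90 min. The mixture was diluted to 1 mL with PDB, and 25 μL of chloroform was added.
Preparation of competent EPI300 cells: cells were streaked on a solid LB plate to obtain single colonies. A single colony was picked into 5 mL of LB and cultured overnight at 37 °C. The culture was transferred at 1% into 50 mL of LB, 500 μL of 1 M MgSO4 was added, and the cells were grown at 37 °C to OD600 = 0.85.
Transfection: 10 μL of the packaged product was diluted to 1 mL with PDB, and 10 μL of the diluted product was mixed with 100 μL of the prepared EPI300 cells and incubated at 37 °C for 20 min. The mixture was spread on LA plates containing apramycin and incubated at 37 °C for 12 h. Single clones were picked into 96-well plates containing LB medium and grown at 37 °C for 24 h, after which an equal volume of 40% glycerol was added and the plates were stored at -80 °C.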
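[Example 3] Fermentation conditions for Actinomadura sp. ATCC 39365, the producer of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine, and high-performance liquid chromatography (HPLC) conditions for product detection.
Spores of Actinomadura sp. ATCC 39365 were inoculated into seed medium and cultured at 30 °C and 220 rpm for 48 h. The seed culture was transferred at a 4% inoculum into fermentation shake flasks and cultured at 32 °C and 220 rpm for 6 days. The fermentation broth was collected and centrifuged at 9,000 rpm for 20 min. The supernatant was extracted three times with an equal volume of n-butanol. The n-butanol was removed from the extract on a rotary evaporator, and the residue was dissolved in 1.5 mL of purified water. After high-speed centrifugation at 12,000 rpm, the supernatant was passed through a 0.22 μm microporous membrane filter and analyzed by HPLC.
HPLC conditions: phase A was ultrapure water containing 0.15% trifluoroacetic acid (TFA), and phase B was methanol. Phase A started at 95% and decreased in a gradient to 80% over 30 min; at 31 min phase A was switched to 10% and held at that level until 40 min; at 41 min phase A was returned to 95% and held until 60 min. The flow rate was 0.5 mL/min, the detection wavelength was 254 nm, and the column temperature was 30 °C.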
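The gradient described above can be written as a small lookup that returns the mobile-phase composition at any time point, which is convenient when transcribing the method into an instrument table. This is a sketch under an explicit assumption: the text gives only the breakpoints, so the one-minute transitions (30 to 31 min and 40 to 41 min) are treated here as linear ramps. The names `GRADIENT`, `percent_a`, and `percent_b` are illustrative.

```python
# (time in min, % phase A) breakpoints taken from the HPLC conditions in the text.
GRADIENT = [(0, 95.0), (30, 80.0), (31, 10.0), (40, 10.0), (41, 95.0), (60, 95.0)]


def percent_a(t):
    """Percent of phase A (0.15% TFA in water) at time t, by linear interpolation."""
    if t < GRADIENT[0][0] or t > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-60 min program")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            # Assumption: composition changes linearly between breakpoints.
            return a0 + (a1 - a0) * (t - t0) / (t1 - t0)


def percent_b(t):
    """Percent of phase B (methanol) at time t."""
    return 100.0 - percent_a(t)


if __name__ == "__main__":
    for minute in (0, 15, 30, 35, 41, 60):
        print(minute, percent_a(minute), percent_b(minute))
```

Keeping the breakpoints in one list means a change to the method only requires editing `GRADIENT`.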
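[Example 4] Conjugal transfer method for Actinomadura sp. ATCC 39365, the producer of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine, and for heterologous expression.
The plasmid to be transferred by conjugation was first transformed into competent E. coli ET12567/pUZ8002 cells. After transformants appeared and were verified, a positive single clone was inoculated into 5 mL of LB broth and cultured overnight at 37 °C. The culture was then inoculated at 10% into 5 mL of LB broth and grown at 37 °C for 3-5 h. Spores of the Streptomyces host were centrifuged at 5,000 rpm for 3 min, the supernatant was discarded, and the spores were washed twice with ultrapure water, with centrifugation at 5,000 rpm for 3 min and removal of the supernatant each time. Then 700 μL of TES was added and mixed, and the spores were heat-shocked for 5 or 10 min at a temperature chosen according to the recipient Streptomyces strain (45 °C or 50 °C). Next, 700 μL of 2× spore pre-germination medium was added and the spores were incubated at 30 °C for 3-5 h.
The cultured E. coli cells were centrifuged at 4 °C and 4,000 rpm for 3 min, the supernatant was discarded, and the cells were washed twice with 20 mL of LB. After centrifugation and removal of the supernatant, the cells were resuspended in 1 mL of LB medium. The pre-germinated recipient Streptomyces spores were centrifuged at 5,000 rpm for 3 min, the supernatant was discarded, and the spores were washed twice with 20 mL of LB. The prepared E. coli cells and spores were mixed and spread on MS plates. After 24 h, the plates were overlaid with 1 mL of ultrapure water containing an appropriate amount of the selective antibiotics and incubated at 30 °C for several days until exconjugants appeared.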
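[Example 5] PCR-targeting on the cosmid containing the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster.
Cosmid 3G12, which contains the 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster, was transformed into competent E. coli BW25113/pIJ790 cells and cultured overnight at 30 °C. A single colony was picked into 5 mL of LB medium (containing apramycin and chloramphenicol) and cultured overnight at 30 °C. The culture was transferred at 1% into 50 mL of LB, L-arabinose (1 M) was added at the same time, and the cells were grown at 30 °C for 3 h to an OD600 of 0.4-0.6. The cells were collected by centrifugation at 4 °C and 4,000 rpm for 5 min, washed three times with 10% glycerol, resuspended in 200 μL of 10% glycerol, and divided into 50 μL aliquots for use. Then 5 μL of the prepared fragment was mixed with the competent cells, transferred to an electroporation cuvette, and electroporated (200 Ω, 25 μF, 2.5 kV). The cells were pre-cultured at 37 °C for 30 min, spread on LA plates, and incubated at 37 °C for 8 h. Transformants were verified once they had grown.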
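[Example 6] Recombination, overexpression, isolation, and purification of proteins encoded by genes related to the biosynthesis of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine.
The target gene was amplified by PCR, verified by DNA sequencing, and cloned into an expression vector. The expression plasmid was transformed into E. coli BL21(DE3)/pLysE. A positive single clone was picked into 5 mL of LB medium and cultured overnight at 37 °C, then transferred at 1% into 500 mL of LB and grown at 37 °C to an OD600 of 0.5-0.8. IPTG was added (final concentration 0.1-0.2 mM) for induction, and the culture was grown at 18 °C for 20 h. The cells were collected by centrifugation at 6,000 rpm for 15 min.
An appropriate amount (20-30 mL) of lysis buffer was added to the collected cells. After the cells were resuspended by shaking, they were disrupted with an ultrasonic cell disruptor, and the lysate was centrifuged at 4 °C and 12,000 rpm for 20 min to collect the supernatant. At 4 °C, the supernatant was loaded onto a gravity column packed with nickel resin and eluted with Tris buffer containing different concentrations of imidazole (20-200 mM). The fractions eluted at each concentration were analyzed by SDS-PAGE, and the relatively pure protein samples were collected.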
SEQUENCE LISTING
<110> Wuhan University
<120> Biosynthetic gene cluster of 2'-chloropentostatin and 2'-amino-2'-deoxyadenosine and use thereof
<160> 1
<170> PatentIn version 3.3
<210> 1
<211> 36432
<212> DNA
<213> Actinomadura sp. ATCC 39365
<400> 1
tggtcggcgt cccggtcgtc cccgtcgtgt gcgcgagcac gaccgggcgg gcgcggtcgg 60
agacgaaggc ggcgggcatg ccgcgcaggt cggccttgga ggtgggcggg atcgtggaca 120
gcgtcgcggg cgtgatcgag ccggggtcca cgccgttcgt cccgaaccag cgccggtagt 180
aggggctctc ggcggccgcg tggcggaccg tgcgccgcag ccgccgccgg gtcaggtcct 240
cctggagctc ggggtcgctg cccgcgagca tgtcgtcggc gccctcgccc ggggacccga 300
acgtctccag ggtcacgcgc atgtcgtcgg ccagcgcgtt cacgtcccgc aggcggaacc 360
tgcggcccga catcatcgac caggcataac cgagctggcg taccgcggtg ccgaacacgc 420
gtcgtccttc cactcgtggg gggtggcccc ccgggagacc cggtcccccg ggggttcacg 480
cgagggtgat cagccgttca gcacgaagat ggcggtggac agggcgctga cggcgtactt 540
ggcgtaacgg cgagcggtct tgaccatgag gatctccctg gttgctcggg gcgcacgggt 600
ggcgcccgct gagatccaag gatgtgcggg ggcgcaagag agccactaca gggtcactac 660
aagggggcct cgcctcacac cagcgtcagg tagtgggcca gctcggcggc ggagtccgtg 720
gcgatcccga cgtggccgtc cgggcgcacg agcgtgtgcc gcaccccgcc gtacggctcg 780
gccggcgcgt acgcccgcac ctgcgggccg cccggcacgc ggggcgcggg atccggcgac 840
agcagcgtga agtgcgggcc gcgcagcagg tcgaacaggc gcgcccggga gccgtcggcc 900
agccgcagcg gcccgtcggg cacccgttcg ccgcccgcgc gggccagcga gccgccccgg 960
tagccgaccc ggagctggta ggtctcctcg tcccgccggt acgacttccg gtcgtgcagg 1020
tccgtgctga ggccgagcac gcgggcggcg atcgggcggc gctcctcctc gtaggtgtcc 1080
agcagcgacg ccggcccgcc ggtcagcgcg cgggcgagct tccagcccag gttgtaggcg 1140
tcctggacgc cggtgttgag cccctggccg cccgccgggg agtgcacgtg ggcggcgtcc 1200
ccggccagca gcacgcgccc ggcgcggtag gcgtcggcca gccggatgtt gacccggtag 1260
agcgaggacc acagcagctc ggtgacgcgg gccgcgcgca gcatcgagcg ttcggcgaag 1320
agccgctgga agccggccag cgtcaggtcc ggcgtcgcgc ccggcctgag cggcgccatg 1380
aactggaagg cgtcggtgcc ggccatgggg tagagcgcga ccatccccct gcccggccgc 1440
acccaggcgt ggatgtgctc gcggtccagc ccctcgacgc gcacgtcccc gagggccatg 1500
cgctcctcct cgcgggtctc gccggtgaac ccgatgccga gcgccttgcg caccgtgctc 1560
ctgccgccgt cggcggcgac caggtagcgg gcccgcaggg tcgccccggt gccgagccgg 1620
gccgtcacgc cgtgcgcgtc ctgctcgaag ccggtcagct cgacgccgtg ctcgacggcg 1680
ccgcccagct cggccagccg ttcccgcagc accgcctcgg tgcgccactg cgggtgcatc 1740
cgcaccatcg ggtacggcac ctgctcgctc gccgccgccg gccggtgcat gcggccctgg 1800
tagaccggca gccgcccgat cctgacccgg atcggcgggt agggcgtcgt cccggccagg 1860
acccggtcga cgatcccgag gtcgtcgaag atctccagcg tccgcggctg gaggcccttg 1920
cccttggagc ccgccgaggg cgccgccgcc cgctcgacga cgcggtgggc gatcccgcgc 1980
ctggccagct ccacggcgag ggtgaggccg gtgggaccgg cgccgacgat gagcacgtcc 2040
gtcttgtccg tcatggggcg accgtgccag gcgccgctgc cagatctctg tcagcgccct 2100
gccggaggtc acgagcaggg cggacggtag ggcaggtcgt cgaagtagcg ccgccactcg 2160
gcccgggtga gcggcatccc ggcggcctcg cagacgcgcg cggccaccga ctccggccgc 2220
aggtcccaca accgcaggtc gtggccggcg gccgagacga gggtgtggtc gtcggcgtac 2280
aggacggtcc ggaggtcggg gtgggacagc tcggcgatcg ggcggcggga gcgcaggtcc 2340
cacaggcggg tggtgcggga ggcgtccgtg gtggccaggg cggcgccgtc ggggctgaac 2400
ccggcgatca ggacggtgcc gccgtgcccg cgcagggccg ccggacgggg ggcgcggccc 2460
gggtcggcga tcctccagac gcgggcggtc cggtcgtggc cggtgctgac cagcacgtcg 2520
cccttcgggc cgaagcgcac gccggccacc gcgtcggcgt gctcctcgtg gtgggcgacc 2580
tggtcgaggg cgcggcggcg cacgtcccac agccggacgc cgcggtcggc gccgccgctg 2640
gccagcaggc cgccgccggg gctgaagtcg agcgcgccga cgtcgccgaa gccgtcgcgg 2700
gcggccaggc ggcgcggcgg gccggtgacg tcccagagca gcagccggcc gccgtggtcg 2760
ccggtggcca ggacgcgtcc gtcggggctg agcgcggcga ccgcgaggcc gcgcaccggc 2820
agccggacgt gccgggccgg gcggtagggg tcgcggatgt cccacacggt cacggccacg 2880
ccgcgctcga cggtggccag ccggcggccg tcgccgctca gcgccggatc ggcggccggt 2940
ggaacgttcc agcgcgacag cagcacgcgg tggcgggggt cgcggacgtc ccacagcgcg 3000
ccgccgcgcg cgccggtcgt ggccatcagc gtccggtccc ggctcagcac cagcccggtc 3060
accgggtcgg gcaggcgcac cggatggcgc cactggtaga aggcgccggc cgccgacccg 3120
gtggccagcg cctgaccgtc ggcggtgaag gcgaggccga gcacgcggtc ggggtgcggg 3180
aacgaggtga cctccgcgcg ggtggccacc gtccacagcc ggaccgcgcg gtcgtcgccg 3240
gcggcggcca ggacggtgcc gtccgggctg aacgccaggg cgccggcggg gtcgccgtac 3300
cggacgggcc ggtcgcagcg gcgggcgcgc aggaggcaga gccaggtgcc gcggccgggc 3360
gccgaccagg ccagcgcgcc gccgtcgggg ctgatcgcga gcgcggcggg gccgtcctgg 3420
aggccggtcg cgatcgtgtc cgcgggggag gcgggccggg ccgggtccca caggcgcacg 3480
ctgccgtcgt cgtggccggt gaccaccgtg tcgtcgccga cggcgaagcg ggtgacctgg 3540
gcggaggcgc gtaccggagg cagggcgacc gggcggccgg cgagcctcca gcgccggacg 3600
ccgtgcgcgg cggcggcgag caccgtgcgg ccgtccggtg tgaacgccgc cgtcgccgcg 3660
ccgggcaggg cggcgacgtg ccgggggcgg gcggggtcaa cgacgttcca cacctcgacg 3720
ccgtgccggg tggccgcgag caggacgcgg tcaccgccgg agggccggaa ggcgagcgcg 3780
ccgaccggac cggcggcgcg cggcagggtg gccagcgggg tgcggcggcg cgagtcggtg 3840
gcgtcccaca gctcgaccac gccgttgtcg gcgccggtcg cgagcagggc gtggtcctgg 3900
ctgtaggtga tcgcccccag ccggtcgccg gccgagcgca ggcgggtggc gtacggctcg 3960
gcggtgtcga gcagcgtgcc gcgggtgccc ggggtgtcgg ccaggccctg ggcggccacc 4020
gccacctgca tcgccagcgc cgggtccgtg gcccgcagcg cccgcgcctg gacggcggcc 4080
cgctcggcga gcgcggtgac gcgggcggcg ctcgcccggc cgctcttctg ccacgccagc 4140
cccgcgcccg ccccggcgac caccaccagc accaccagcg ccaccaccag cccgcgcacc 4200
ctccgcccga ccggacacgc cctggcgccc ccgccggagc cggaggcggg agccgcgggc 4260
cctccgccag aggaggaagc cgccgactcg ggatcgggag aggcgggaac gggatcctcc 4320
cggtcggagc ggggaagggc gggtctctcc cggtcgggag agggagaagc gggcgcgggg 4380
ccgggggcga gaggcggcgc ggtgacgtgc cggtaggcgg cgtgggtgag cgtgtcgccg 4440
tcgcggcggt gccaggtggc gcgcatcgcg tgggacaggc gcggcagcgc gtcggccgcc 4500
gccggggcgt cgtcgcgtac gcccaggtcg tgcaggatcg tctcggccag ctccggctcc 4560
agggacagcc ccgcgacccg ggccggttcg gtgatcgcgc ggcgcagctc gtcgccggtc 4620
atcgggccga gcggcagcga ccgcgtccgc agcgcctcga ccagtggcgg aaaccccagg 4680
caccgctcgt agcagtcggc ccgcatcccg atcaccaccg gcagcccgcc cgccgcgagc 4740
gcgcacagct cctcgacgta cgcctcccgc ccggaccccg cccgagccgt gaacagctcc 4800
tcgaaccggt cgaccaccag cagcgcgcac tccgccaccg gccgcccccc gcccgggacc 4860
ggcgagccgg gcgtacgcca caacaccccg gcacggcccc ccgccggcga acccgccagg 4920
acggccggca ccagccccgc ggccagcacc gacgtcttgc ccgaccccgc cgccccggcc 4980
agcaccacgg gacgggcggc gcgcggctcc ccgagcagcc ccaccagctc ggacacgacc 5040
cgctcgcgcc cgaagaacca cctggcgtcc ccggtcgtga acggccccgg cggccagggg 5100
cacggatccc gcgcgcaccg gtcccgctcg ctccccagcc ccgcgccgaa cccgtcgtcc 5160
atcccgctcc cgtcactttg tgaagttgat cggggactca gggtacgcgg gcccgccagg 5220
cggcgtacgc cggtcgggaa ggcggcatcg cccggaccgg cgcgaagccg gggacggtca 5280
gcgcttgcgg ccgacgcccg cgcagtagac ggcctcgcgg tcggcgtaca gcgtctcggg 5340
gacgggacgc cagcgcggca gcggcaccac gcccggctcc agcagctcca gcccgtcgaa 5400
gaagctcgtg acctcctcaa gggtgcgcag cgtcatgggc gccccgccgc tctcgttcca 5460
gtggcggacg ccctcttcga gctgcctgcc gaccacgccg tgggagacgg ccaggtggct 5520
gcccggcgcc atcggctcca gcagctcggc gatgatcgag cgggccaggc cggtgtcggg 5580
aaggaagtcc acgacgccga gcagggtcag cccgatcggc cggtcgaagt cgaggtgctc 5640
ggcggcggcg gccaggatct tgccggggtc gcgcaggtcg gcgtcgacga aggcggtcgt 5700
ccccgcgggc gcggcccgca gcagcgcccg cgcgtgggcc aggaccatgg ggtcgttgtc 5760
gacgtacacg acgtggctgt ccggggccac cgcctgggcg acctggtggg tgttgtccgc 5820
cgtggggatg ccggtgccga tgtccaggaa ctgccggatg cccgcctccc cggccatgta 5880
ccgcaccgcg cggccgagga actccctgtt cgtacgggcc agcaccggca ggtcgggcac 5940
ggcggcgagg atctgctcgc tcaccatccg gtcgatctcg tagttgtcct tgccgcccag 6000
ccacaggtcg tagatgcggg cggggctcgg aacatcggtg ttcagcgaac gttcgctcat 6060
cgcgacgcct tccggacgaa aggggctacg ggggagatcg tagccccaag atcccttcaa 6120
ggggactcag agcaggcggc gctggatcgg gtcggccatg gccgtggtga cgatccactc 6180
gacgacggcg gcgtcgggcg ggtcgtcctc gcggccggcg aacagcaggt gggcgctgcc 6240
gatcagggtg agcgccaggg cgtcgacgtc ggcgtcgacc gcgatgcggc ccatctcgcg 6300
ctcggcggcg aggtaggcgg cgatcatggt cgtggcctcg cccaggagcg ggatgcccgc 6360
cgacccggcc tggcgcaggc gggcgcgcag gccgtcgcgg gagatgacga ggctgacgac 6420
ggccacggcg accgagccga acagctccag cagcgcgagg gtgaggttcc ccgcgacggt 6480
gccggtgccg gcgctcgcgc gcagggcggc ggactgggcc tcgacgcggc ggatgcggtc 6540
ggtcaccagc tcggcgagga aggcgtcgaa gtcggcgaag tggcggtgga gcacgccctt 6600
ggcgcagccc gcctcggtgg tgaccgcccg gctggtcaga gcgttggccc cgtcgcggag 6660
caggacgcgc tcggcggcgt cgaagagctg ttcgcgcacg tcgcggaggg ctacgcctgt 6720
gggcaccggg ccggtctcct tccttcgcga atcctgttca gagtagacga gtgggcacgt 6780
gcccattaga gtgggcgcat gcccactcta cccccgagaa cgagccgcac ctgcaccggc 6840
ggatggccga gtccttcggc gccgacgccg accgctacga ccgcacccgc ccccgctacc 6900
ccgaggagat ggtgacccgg atcgccgggt cgagccccgg ccccgacgtg ctggacgtcg 6960
gcatcggcac cggcatcgcg gcccgccagt tccaggcggc cggctgccgg gtgtccggcg 7020
tggacgtgga cccgcgcatg gccgacctgg cccggcgcga cggcatcgag gtggacgtgt 7080
ccgccttcga gtcctgggac ccggccggcc ggacgttcga cgcggtcgtc gccggccaga 7140
cctggcactg gatcgacccg gtcgcgggcg cggccaaggc ggcccaggtg ctgcggcccg 7200
ggggcaggct ggcgctgttc tggaacgtct tccagacgcc ggacctcctg gccgacgcct 7260
tcgccgaggt ctaccgccgc gtgctgcccg acctgccggc gctcggccag gccggccccg 7320
ccgtgggcat gtacacgtcg ctgctcgaca agatcgccga cggggtccgg acggcgggcg 7380
cgttcggcga gcccgagcgg tgggactacg cgtgggagcg cgtctacacc cgggacgagt 7440
ggctggcgca ggtgcccacg cacggcccgg tcaccggatt gccgccggag cggcgccgcg 7500
aactgctggc gggcatcggg gccgcgatcg acgcggtggg cggcgccttc accgtgcact 7560
acgacgtcct ggtgctcacc gccgtccggc gggctgagta gccgttactc atgcggtcat 7620
gcgagcgcgc gcggacgatc ctcctcaggg cgacacccca ggaggaacca ccgtgagcgc 7680
cacacatgac atgagaaccg ggcggaaagg gccggcgcgg tggcgcgggc ggctcacgtt 7740
cgccgcgatc ggcgtgctgg tcgccgcctt gatcccggcg ctggcgacgc cggcgtcggc 7800
ggcggtcccc ggcttctcgc tgaccttctc catcgactat ctccgccaga tcgaggaccc 7860
ggacagcgga ggcgtcggcg acggcgacta ctaccccaag gtcaggatcg ccgacggccc 7920
gctcgagacc ggcccgcgca tcgaggacga cgagttcgag cccctcggcc tgcccgaggc 7980
cccgggcggc tgggtcttca ccaagtccga cctgcccggc gaccagccga ccgtgaacat 8040
caccgtcgcc ctctgggact atgacggcgg cctcaacacc aacgacgacc gcatggacat 8100
cagcccccag aacgacgacc tcgacctcga cctcgtctac gacgtgcgct cgggcaccct 8160
gtcgggcgac ggcgtgcagt tcggcacgcc gtgcgtcgac tcccaggggc gggagaaggg 8220
ccagacgtgc gccgagggcg acggcgacca cggcttcccc agggacaacg acggcaggaa 8280
gaccaggatc ggcttcaccg tctccaccgt gatgcccgac accgaccacg acggcatccc 8340
ggacctcgtc gagctgtccg ggatccgcga caggaacggc ggggtcatcg ccgacctccc 8400
ggcgctcggc gcggacccgt gccgcaagac gatcgtgctg caggccgact acatggccga 8460
cgccaggcac tcccaccggc ccaagccgga cgcgatcacc ctggtcagga acgccttcga 8520
cgccgcgccg gtcaaggcgg agaacccgtg cccgtacccc gggacgcacc gggacggcgt 8580
cgacttcgtc tacctccagg gcgcgcaact gcccgagcag gccgccatgg gcctggccga 8640
cgacagcgac ttccgcgacg cgcgcaccgc cgacttcccc gccgagctgg cccggtacgc 8700
gcactacgcg atcttcgccc acgacctcgt cgtcaccacc gcgaccggga gctcgacgtc 8760
gccgggcacg tcggggcagt gctgcgagcc caccagggac aacaacaagg acttcatcgt 8820
caccttcggc tcgtggcgga cgatgtgcgt ggccgacttc ggcgtggact acggcggcga 8880
cggccggctc cagaccaccc cggcgggcga cgacgtggtg atcggcaccc agatccacgt 8940
gggcccggac cgtacgtgcg acaccaccgg cgcggacccg accgaccggc aggtgctcac 9000
cgtcgggacc ggccgggacg acgccgaggt cggcaccgtc caggaccagg ccgggaccat 9060
catgcacgag ctcgggcacg ccctgggcct gcggcacggc ggcgacgtca acgaccagtg 9120
gaagcccaac tacctcagcg tcatgaacta cttcttccag tcgggcatcc ccaccggccc 9180
gccgccgctg ctggtcaacc agctcaggga ctggaagatc ggcgcgatcc gggtcggcta 9240
ctccagcggc ggcctgcccc cgctcaccga gggcacgctg aacgagtccg ccggcatcgg 9300
cgacggcacc gaccacacct tctggtggac cgacaacctc gccaacccca ccagcacgtt 9360
ccggccgctg cgctcggccg ccggcaacgc gccgatcaac tggaacgaca acaccgcccc 9420
cggcaccggc gccccgctca tcgacacggg gacggtgaac gtggacatca acgccagcgg 9480
cggcagcacg accctgcgcg accacgacga ctgggccgcc gtcaagtacc gggccgcgct 9540
ctcgccggac gccaagggct gggcctgcag cagcttcccc gtctccgcgc ccacgtgcag 9600
cggctcgggc agcggctcgg gcagcggctc gggcgccgag ccggagaagg agctcgactt 9660
caccaccgcc gtacggcagg agatgtcctt cttcaacctc tacgaccccg acatcgtcac 9720
cgccaagagc gtcgacaagc ctgacgccga acccggcgac accctcacct accaggtgaa 9780
gctggacaac gtcggcaccg gcccggcggc ctccgtcggc gtgaccgaca ccccgcccac 9840
cgggccggct cagacccgcc aggttcccta cctgggcgcc ggacggtcca cgaccgagac 9900
gttcacgtac acggtgccgt gcgacacggc cgacgcggcc gtcctgacca acacggccac 9960
cgccaccgcg acggacaccg ccgacggccc cgaggccgac acgggcgaca acatcgccaa 10020
ggccgccacc accgcgcacg cgccgcggct caccctgacc aagaccgccc cggccacggt 10080
gaacgccgga gaggcgatga ccgtcgggct gcgggcggcc aacgtcggca gcggcagggc 10140
caccgacgtc gtcctcaccg acacgctgcc caaggaggtc tactacagca gcgcgctcga 10200
ccgcggcagc ggcccccgtc ccggcacggt cacccgcaac ccggacggca ccaccaccct 10260
gacctggacg ctcggccccc tcgacggcgg cgcggacgtg tccgtcgggt tcacggcccg 10320
gcccggcctg ctgttcacga agggcgccga gctgcccgac acggcggccg ccgcgtaccg 10380
gaacgcgggc ggctgcgtct acgagcccgc cacggcgtcc gccacgacga cggtcaccga 10440
ggtcgccccc actcgcgacc cgcgctcgca cggctactgg aagacgcacc aggaggctcg 10500
caccgccgag ttcctggccc gcgtccaggc caccgaccag cggttcgaca ccgggcagga 10560
cggcgagctg gccgacgccg aggccgtcgc cgcgctctcg gccggcggcc cgcaacccgg 10620
tccggcccgc ttccagctcc tggccacgct gctcgacctg gcggcccggc aggtcaacgc 10680
gagcacccgg ctcgactcgc cgctcgtccg gcggctcggg gtgcgcacgg tcggcgaggc 10740
ggtcaggtac gccttcgcca ccctcgacct gccgcccggc ggcgcggcgg ccgggcgcta 10800
ctccgacacc acctcgatgc tcgacgagat cgtgaacaac cggagcgagg tctactgacc 10860
cccgccagga cgccgccatg agggcgccgc ggccggctca ggccggttgc ggcgtccgcg 10920
gctcggccct ggccggggcg tagaggccga ggcggcgcag ggccggcgcg gtcatcatgg 10980
tggtgatcag ggccatcagc acgagcaccg tgaacgtgcc gggcgagatc agccccatgc 11040
cgagcccggt gctgagcacg acgatctcgg tgacgccgcg ggcgttcatg agcacgccga 11100
gcccgagcgc gaggcgtccc ggcatgccgc ccgcccaggc gatcaccccg gccgcgccca 11160
gcttgccgag gacggcggcc accagcagca ccgccccgcc cagcagcacc accggatgac 11220
cgaacgccag atgcacgtcg gtgcgcagcc cgatcgaggc gaagaacacc ggcagcaaca 11280
tcgcgcggtt gagcgagccg aggcgttcgg gcaccgcgcc gagcacgggc gcgtcgcgag 11340
ggaaggcgac ccccgccagc agggcgccga agatcgcgtg cacgccgatg gcgtcggtcg 11400
ccgccgccag ggcgaagatc aggcccagga ccagcgcgag cgccgccgga gcgggcagcc 11460
ggcgggcggc gtgccgggcc gccagggcgg cgagcgcggg gcgtacgacg agggtcacgg 11520
cgaggaccag ggcggcggcc agccccagcg aggtcagcac gcccgccggg gaaccggcgt 11580
gggccagggc gatcaccgcg gccagcgcgc accaggcgag cacgtcggcc agccccgcgc 11640
acaggatcgc cagcgagccc agccgggtgg cggtcagccc ggcctcctgg aggatccggg 11700
ccagcacggg gaacgcggtg acgctcaacg ccgtcccggc gaacaggacg aacgccgtgc 11760
ccgacgccga cggcccggcg agcgcaccgg cgaacgggac cgccgcgacc gctcccaggg 11820
cgaacggcac ggccgtcatc gccagcgcgg cgacccccac gaccacgcgc tgccggccga 11880
gggcggccgg gtcgaactcc cgccccaccg tgaacatgaa gaccgccagc ccggcctggg 11940
ccagcagttc gagctgcggc gcgacccagc cgggcaggat cagcgcggac accccgggtg 12000
ccagcgagcc cagcgcggac ggcccgagca gcagcccggc caccatctgc cccaccacgg 12060
acggctggcc cgcccggcgc gccaggacgc cgcccagcga ggccagcccg ccgatgaccg 12120
ccacggcgag gaacagccgc aggatcgact gggcgggcgg caccccggcg tgcgcggccg 12180
gcgccgccgc ctgccccggc tgctgcgcgc ccggcccgaa caccagcagc atcgacacgc 12240
ccgccacggc gatcagcagc agcgcgaccg ccaggccccg ggccggggcg ggcagcaggg 12300
cctgttcggt gtcgtcgagc gggtcgtgtc cgcgcatggt cgcaccttcc tgcctactcc 12360
gcgatgggct cccggcccct cggcggccgg ctcagtcggc ggccgggagc ggccgggccg 12420
cccgccggcc gatcagcagc agggcgccga cggcggcgac gatgccgacg gcgatgacgg 12480
ggatcagggc ctggtagccg atctcgccgg ccaccaggcc gaacagcggg ggaccgagca 12540
gggagccgag gctgccggcc tgggcgatga cgccgttgcc gacgtcggcg tggtcgaggc 12600
gttccaggac gagggggagc gcggcgaaga ccagggcgcc caggaagccg ttctccatgg 12660
agatgacgcc cgcgccggtg agcgcgcccc cggcggagcc gccggcgtac aggatccagg 12720
cggccggcac gacgaggaag cccgacaggg ccagcacgcc gacccgtacg cccctgcgca 12780
gcaggacgcc caccgccagc ccgccggcca cgccgagcag cgagatcacc gaggtcagcg 12840
cgcccgcggc ggcggccgag cggccgtact gctcgatgag cagcgtgggc agcagcacga 12900
tcaccgggat ggtcaccagc gcggtcaggc agaaggccgc ggccagcacg cccggccagg 12960
tgagcgcccg cgcccggggc accggcccgg ccgcctgcgc cccgtcggcg cgcggcagcc 13020
gcgcccacgt caccgcggcg atcagcagcg tcagcccggc gatcaggccg gtccagccgc 13080
gccaccccag cgcgtcgccg gccgcgccgc cggccagcgt gctggccccc agcccgaccg 13140
ggacgaacgt cgcccagacc gacagggcgg tgcccctgtc gcgctcgtgg gccaggcgca 13200
ggatcagcgc cgggcaggag atggtgacca ggacgtagcc gacgccctcg acgccgcggg 13260
cgatgagcag ccaggcgaag tcgttcgccg ccatgctggc cgcgcccgcc gccgcgatca 13320
ggaccaggcc ggtgaccagc gaccgcccgg tgccgaagcg gcggacgaga taccccgccg 13380
gcaggccgac caccgccccg acgccgacca ccgccgagat cacccagccc agctccggca 13440
gcgacagccc gagctgggcc gccatcgcgg gcccgaccga ggagaacttg ccgaggctca 13500
tcgcgccgac gacccccccg acgtagacga ggaggatcgt cagccagggg gagcgtcgcg 13560
gggacgcctc cgtgatcttt gccgccgacc gggggtccat gagtctcctt cgcactgcgc 13620
tccgcgggcc agggttccca cgtccgatct tcgccatgcc ggcggcgccg atcaacccgc 13680
ttcccccgtg gcgaagatgc cgatgcccga ctcggtgacg ttgaccgccc cggccgacag 13740
cagtgcgtcg atcgtctcgg gcacctcccg cctgggcagg acgccctgga ggacgaccat 13800
ggtcccgtcc gcgaggtcgg cgcccgcgcg ccacgactgg cccggcatcg cgggcaccac 13860
ctcggccagc cggtcggcgg gggcacggac cgtcagcagc acgtgggccg ggcccaccca 13920
ggcggcgcgc agcagcaggg cgacgttcct gatcacgccg cgcctggccg ggtcggcgta 13980
ggcggccggg ttggcgacga gctgcgcgcc gcaggtccgg acggtctcga tgacgcgcag 14040
gtcgttctgg cgcagcgagt ggccggtctc cacgacgtcc atcatggcgt cggccagctc 14100
ggggatcttg gcctcggtcg cgccgtagga cacgaccacc tcggcccgcc gtcccgcctc 14160
gcggaagaag cggcggccga tgttcgggta ctcggtggcc accctgaccc cgtcgtcgag 14220
gtcggcggcg ctgcgggcgg ggtggccgga cggcaccgcg agcaccagcc gccacggccg 14280
ctcggtgttc ctggagtagt cgaacgactc caccacctcg acctcggccc cggtctcctc 14340
cacccagtcg gcgccggtca gcccgaggtc gaaggtgccg tccccgacga ccagggggat 14400
ctcgcgcggc ttgtagaagg ccacgtgcga cacgccggag tggtccaccg tgccgcggta 14460
ggagcggtcc gagcgccgcc gtacgggcag ccccgcggtc gcgaagagct ccagggcccg 14520
ccgctccagc gaacccttcg gcagggcgag cgagatcatg gtttcctcct gacgggcacg 14580
gcgggacggc gggggcgcga gcatcacacg tgcagcagcc ggtcgatgcc gacgacgggc 14640
ggctcgggca ggcaggacgg ccgcacctcg atctcgccca ccacgaggtc gtccggccag 14700
gagtcgacga tggtggccac gcagcgggcc accgaccggg tgctcatctt gtgcggcgag 14760
ccgtcgccgt cgaggttggc gatggcgccc ggcgagacca ggcaggcgcg caccccgttg 14820
cggcgctcct ccagcagcag cgtctcgacc agcgccttca gcccggcctt ggtggcgctg 14880
taggcggccc cgccctcgaa gtaccgcgtc ccggcgtggc tgcccatcgc gacgaacatc 14940
ccgccggcgg cccgcaacgc cggcaggacg gctttgagca ggtggaagca ggcgctcagg 15000
ttgatcgcga ccagccgctc ccagtcctcc gcgcgcagct cggcgatcgg gtcgagcacc 15060
cggtccaccg cgttggacac gcagacgtcg agccggccgg ccgagccgag cacctgggcg 15120
gtcgcctccg cgatctgcgc cgggtccgac aggtcgcacc tgtgctcgcc gagccagtcc 15180
tcacccgcga gcgagcggtt cagggagaag acgcggtagc cgcgctcgcg cagctcctcg 15240
gcgatggccc gcccgctgcc ccggttgctg ccggtgacca gcgcggccgg ggtctcatgt 15300
gatgacatcg aggacgcggc tccactggtt cacgatctcc tcgcccgcca cgccgtcgcg 15360
gaacaggtcc ttgtccagcg gggagccgtc ctggtggcgc aggcgcatgc agtcctggga 15420
gatctcggag atgacggtca gcccgccgtc ggaggtccgg ccgaagatga ggcagaagtc 15480
ccacacctcc agcgggcgca gccaggaccg cagcacgtcg ttgacggcga gggacagctc 15540
ccgcatctcc tccaccggca ggtccagcgc ccgcaggtag tcctcgccga tcggctggtc 15600
ctcggggtcg gtccggtagt cgaacttcac gaccgggcgc ggcagcgggg cgccgtcctc 15660
gaacaggccg gggtacttgc gggtggtcga ccccgtggcc cggttcttca cgatcacctc 15720
gaacggcggc gccgggcagt agtcggccac gtacgagtcg gccgagaccc gttcccggaa 15780
cgcgcacggc acgccggcgt cggccagccg ggccgccgcc ctttcgtaga aatcgagccg 15840
gagcggcccg gtgcgctcga cgatttcatc acggtggtac gtgaaactgc gcaggctggg 15900
gattatctgc accaggcacc gcccgtcggg cagcagccac agacgcttac tgcgcccgac 15960
gatatccggc tctccccggc cggccggagt tcccggctct gatccgttat ccacgccgcc 16020
tccatccgcc gacaatggcc cacacaactt gatattgcgg gcacctcgtt gtcaaggcgg 16080
cgtgaccctg gaaaaaggag cacttgacgt ggctctcgcg gtcttgatat gtcgtccgcc 16140
ctcgccggga gcactggaag aaagagtact tgacgggtcc cggcgaccgg acggtactgt 16200
cttcagaaat tgaaattccg atcttaatta gaatcccaat ggggggagga tcgttgtcgg 16260
aatatgagcg catagctcac tcgaagatca cgcgtaagtt caggtccgcg gggcagcagc 16320
tcttgttctc gcgggggcgg cggggcgagg tgtccgacgc gaacggcgta ccgtacatcg 16380
atttcgtcat gggttatggc ccagtgataa tcggccacgc ggacgcccag ttcaacgaaa 16440
ttctctgcgg ttatctcggc aacggagtca tgctgccggg ctacaccacg ttccaccagg 16500
aatacctcga ccggctgctc ggcgagcggc ccggcgaccg cggcgcgttc ttcaagaccg 16560
cctccgaggc cgtcaccgcc gccttccggc tggccgcgat gcgcaccggc cggctcggta 16620
tcatccgcag cggctacgtc ggctggcacg actcccagat cgccgactcc ctcaagtggc 16680
acgagccgct gcactccccg ctgcgggaca agctgcgcta caccgacggc atgcgtggcg 16740
tcggcgagtc cgagccggtg gccaactggg tggacctgcg cctggaatcc ctcgccgagc 16800
tgctggaacg gcaccgcggc cggctgggct gcttcgtctt cgacgcgtac ctggcctcct 16860
tcaccacggc cgacgtgctg cgccaggccg tcgccatgtg ccgcgaggcc ggcctgctca 16920
ccgtcttcga cgagaccaag accggcggcc ggatctcgcc tctcggctac gaccacgaca 16980
acgccctcgg ctccgacctg atcgtgatcg gcaaggcgct ggccaacggc gccccgctga 17040
gcatcctggc cggcgacgcc gacctgctcg cgctggccga gaaggcccgg ctgagcggca 17100
cgttctccaa ggagatgatc gccgtctacg cggcgctggc cacccgcgac atcctggaga 17160
agcccgtcgg cgactcgccg gacggctgga ccgagctcgg ccggatcggc acgcaggtgg 17220
ccgccgcctt caccgcggcc gccgcggacg ccggggtcga ggccctggtg ggcgcccgtc 17280
cggtgctcgg cggcggcatg ttcgagctcg tctaccacga cgtggagcta ctgggcgaca 17340
aggagcgccg cgaggcgctg ctggccgagc tcgccggggt cggcatcctg ctgctcgaag 17400
ggcacccgtc cttcgtgtgc ctggcccacc gcgacatcga ctggggcgac ctgcgcgacc 17460
gggtgcggca ggcgttcgag gcgtggaccg cccccacggg ggccggccgt ggctgacctc 17520
ctgcgcctcg ccctggtcgg ctgcggccgc cagatgcagc agaacctgtt cccgttcctg 17580
caacgcatcc gcggccacca ggtcgtggcc tgcgtcgacc ccgacctgtc gctggccgcc 17640
gacgtgcagt cgctggccgg cggcgcgacc tgcgtctcct cggtggacga gctcgacctg 17700
gagatggtgg acgccgccgt gctggccgtc ccgccggagc cgtcctacct gctggtccgg 17760
cagctcgccg agcgcggggt ggactgcttc gtcgagaagc cggccgggcc gtccacgccc 17820
gcgctccagg acctggagca cgtggtgcgc cgcagcgggc ggcacgtcca ggtgggcttc 17880
aacttccgct acgccgagac cctgcagcgg ctgcacgagc tgagcgagga gatccgggcc 17940
acgccgtgct cggtgaccat cgacttctac agccgccacc cgtccgcgcc gcagtggggg 18000
gtggacacca ccctggaggc gtggatccgg cacaacgggg tgcacgccat cgacctggcc 18060
cgctggttcg tgccgtcgcc ggtcgtccag gtggacgcgc acgccatcgc cagtgacgcc 18120
gaccggttcc agatcaacct gttcctgcgg caccacgacg gctcgctgtc cacgctgcgc 18180
atgggcaacc acgtcaagcg gttcatggtg ggcgtcaccg tccagggcat ggacggcagc 18240
cggtacagcg cgccgtcgct ggaacgggtg acgctcgaac tgtccgacgg cgtgccccac 18300
ggccaggagc tgcacgccac ccgcaacctg gaccacgggt gggggcgcag cggcttcggg 18360
cccgagttgc aggcgttcgt ggacgcctgc gcgcagcggt cggccgagcc ccagacgggc 18420
ggccgtcccc cggtcaaggg cgtgccgtcg gtctccgacg cgctggccgc ctccgccctg 18480
tgcgaccggg tgatggccga gctcaacacc gccgcgacga acggcttcgg cctgctgacc 18540
gggtccgtcc gggcctcggc ctaggccggg ccgtcgacga ggagggggag gcatgaccga 18600
gcgggcgcga ccggccgcgg gcgggcggcc ggtcgcgtcg ccggatccgg ccgggtcacc 18660
tgatccggcc gcggcgcggg gggtgggccg gtggatcgcc ggtcatctgc gccggcaccg 18720
gccgtcgctg ctgctgttca ccgtggccgc cgcggcggcg agcctgtgct ccacgctcgt 18780
ccccgtccag atcggcgcag cgttcgccga ggccaccggc ccgcgccacg acctcggcgc 18840
ggtcggggtg gcggcgctgg cggccgcgct gctggcggcc ggccggttcg tcgccgacct 18900
gctgtcgaac ggctcgatgg aggtcgtcgc ccagcgcgtc aaacgggacg tgcgcgacga 18960
gctgtaccgc agcctgctga ccaagcgcat ggccttccac gaccagcagc gcatcggcga 19020
catcctggcc cgcgccatca acgacgcgag actggtcgac tacatgctca gccccggggc 19080
ggccaccgcc gccaacggcg tgctggccct gttggtgccg atcctgttca tcgcctcgct 19140
ggacccgcgg ctgctgctgg cccccggcgt gctggtcctg gtgttcgcct acgcgctgcg 19200
ttaccacctg cgccggctct acccgctggt catgcgcacc cgcgagacct tcgccgacct 19260
caacgaacgc ttctccacca cgctgtcggg catcgccacc gtcaaggcag ccacgcagga 19320
ggacttcgag cggcacgccc tgagatcggc cgccgccgcc taccgggacg ccttcgtccg 19380
gcgcggccgg gcgcaggcgg tctacctgcc ggcgctgagc ttcgccctcg ccatggtgac 19440
cggctccctg cactccctct acctctactc gcagggggag ctggcgctgt cccaggtggt 19500
gacctacctc ggctggctgc tgctgttcgc gcagcccgtc accatgtccg agcaggccgt 19560
gccggtcatc caggagggct tcgcggccgc cgcgcgcatg cggcagatca tcgaaggcgc 19620
ccccggcgag cgcgaggaca cccgcggcag cacggccgcc gtcgagggtg cgatcacctt 19680
cgaccgggtg tcgctgcgcc acgagggccg ggacatcctg cgcgaggtct ccttccacct 19740
gcccgccggc cgcaccctcg ccgtcgtcgg ccccaccggc agcggcaaga gcatgctgat 19800
caaactggtc aaccgcatgt acgacgccac cgagggccgg gtgctgatcg acgggcggga 19860
cgtgcgcgag tgggagccgg gcgcgctgcg ccgccagatc ggccacgtgg accaggagat 19920
cttcctgttc tccaagagcg tcctggacaa catcgccttc ggcgccccgc acgccgccgg 19980
gtacgaggac gtcctgcgcg tcgccaagca ggcgtgcgcg gacgagttcg tccaggggat 20040
ggccgacggc tacgccaccg tgctcaacga gggcggcacg acgctgtccg gcgggcagcg 20100
gcagcggctg gccatcgcca gggccctgct gaccgagccg cgcatcctca ccctcgacga 20160
cgcgaccagc gccgtggacg cccgtaccga atcggccatc accgaggcga tcgagcgcgc 20220
gacggcgggc cgcacgacgg tgctggtctc gcaccgcccc ggccagatcc gccgcgccga 20280
cctcatcctg ctgctggacg gcggccgggt cgtcgaccag ggctcgcacg acgaactgat 20340
ggcgcgctgc gcgctctacc gggagatcta cagtggctga cgcccaggat cggcggcgcc 20400
cgtgagcgac ctgttcgcgg gcctgcacgc cgacgcctac gaccgcgtgt acggcaaccg 20460
ccgcctgctg gcccgcatcc tgcggcagtt gcgcgcccac cgcgccggca tgatcggctc 20520
ggccctgctc gtcgcgggct cggtgctgct gaccgcctcc gtgccgatcg tcgtctcccg 20580
cctgatcgac caggccggcg gcggcagcgg ccccggcctg ctgctgtgcc tgttcgtggt 20640
cgcggcggcg gtgctggcct gggcgatgaa ctgggcccgc caggcgatca ccgcgcggct 20700
ggtgggcggc atggtctacc ggctgcagtg cgaggccgcc gacgccgcgc tcgccaagga 20760
cgtcgccttc tacgacgaga actcggtcgg caaggtgctc agccgggtca ccggcgacac 20820
cgagggcttc ggctcggtgc tcaccctcac cctgaacctg atcagccagc tgttcctggt 20880
ggtgctgctc ggcggcgtgg tgttctggat cgaccccggc ctcggcctgg tcatcctggc 20940
ggcgctgcct ttcctgctgg gcaccgccct ggccttccgc cgggtggccc gcacggcggc 21000
ggcggccacg cggcgggtga tggccaaggt caacgccaac gtgcacgaga ccatgctcgg 21060
catcgcggtc gccaagagct tcggccgcga gcgggccgtc cacgacgact tcgacgacgt 21120
caaccggctg tcgtaccgcg tctacgtccg gcagggcctc atctacgcgg tcatcctgcc 21180
ggtcctgacg ctgctggcgg gcctggccac ggcggtcgtg ctgtaccagg gcggccggtt 21240
cgcggcgatc ggcaggctct cggcgggcga gtggtacttc gccctccagg cgctcggcat 21300
gctctggcag ccggtcaccc aggccgcctc gttctggagc ctgttccagc aggggctggc 21360
cgccgccgag cgggtgttcg cgctgatcga cagcgaccac gccgtcgtcc agagcggcaa 21420
cgcgccggtg accgcgctga ccggcgagat cgaggcgcgc ggcctgcact tccggtacgg 21480
ctcgggcgcc gccgtgttcc gcgacttcga cgtgcgcctc gcggcccgcg agacggtggc 21540
ggtcgtcggg cacaccggcg gcggcaagtc cacgctggcc aagctgatcg cccgcgccta 21600
cgacttccag ggcggcgagc tgctggtcga cggccgcgac atccgcggcc tggacctgcg 21660
ggcctaccgg cggcgggtgg gggtcgtccc gcagcacccg ttcatcttcg cgggcacgct 21720
cgccgagaac atcgcctacg gccgccccgg cgccggccgc gacgacgtcg tacgggccgt 21780
cgagcggatc ggccaccgcg tgtgggaccg cagcatgccg atgtccctgg acgaccgcct 21840
ggccgccggc gggcagggcg tgtcggtcgg gcagcggcag ctcatcgccc tggccaggat 21900
gttcgtccgc gagccggaca tcctgctgct ggacgagccg accgccagcg tcgacccgct 21960
gaccgaacgc ggcatccagg acgcgctggc ccggctgtgc gccgggcgca cggtcgtggt 22020
gatcgcccac cggctgtcca ccatccggcg cgccgaccgc gtcctcgtgc tgcacggcgg 22080
cgagatcgcc gagcagggcc ggttcgacga gctgctgcgc cgcgacggcc cgttcgccgc 22140
gctctacgcc acctactacg cgcaccagga gaccgcatga cctgcgcccc ccatccggcg 22200
gagttcgagc cgatccagcg ggaggcgtcc gccacgggcg tgcgctgcgt ggtcggcgcc 22260
gtgctgttca acccgctcgg cgaggtgttc ctgcaacggc gggcggccca cgtgcggctc 22320
ttccccggct gctgggacat cgtcggcggg cacgtcgagc ggggcgagac gctgtgcgcg 22380
gcgctggcca gggagatcga ggaggagacc ggctggcggc tgctgcgggt gggcggcctg 22440
gtggacgtct tcgactggac cggcggcgac ggcggcctgc ggcgcgagat cgacgtgctg 22500
gccacggtcg agggcgacct gacccggccc gccatcgagc aggacaagtt cgacgaggcc 22560
cgctggctgg acggcgacgc cctgcgccgc ctggccgccg agtcgcccgg ctcgggcatg 22620
gtcgagctgg ccctgcgcgc gctggccatg cggccggtct agggctccgg tttgcgtacc 22680
gcgtccaggg cctgtcgcag ggtgaaccgg ccggcgtaca gcgccttgcc caccaccacg 22740
ccctcgacgc ccagcggggt cagggaggag acggcgcgca ggtcgtccag gctggagatc 22800
ccgccggagg ccacgacggg ccggccggtg gcggcgcaga cgtcgcgcag cagcttcagg 22860
ttggggccgc cgagggtgcc gtccctggcg atgtcggtga cgacgtaccg ggcgcagcct 22920
gccgcgtcga ggcgggccag ggtctcgtac agctcgccgg cgtcgcgggt ccagccgtgg 22980
ctgcgtacgg tggtgcccca gacgtccagc tcgacggcca cccggtcacc gtgccggtcg 23040
atgacctccg ccacccacgc cggacgctcc agcgcgccgg tgccgagcac cacccggccg 23100
cagccggtgg ccagcgccgc cgccaggctc gcgtcgtcgc gcacgccgcc gcacagctcg 23160
accgggatgt ccagggcgcc gacgacctcg gcgatccgcg cccggttcga gccggtgccg 23220
aaggcggcgt ccaggtccac caggtgcacc cagtcggcgc cggagtcctg ccaggccagc 23280
gccgcgtcga gcgggtcgcc gtaccaggtc tcggaccccg actcgccgcg cagcaggcgg 23340
acggcccggc cgccgcgcac gtcgacggcg ggcaggaggg tcaggcatgt ggacactcgt 23400
cgtttcctct cccgtgaacc ccccttttcc atcatcgcga acactggtca ggaatcgaag 23460
tcaaccttcg gccaccacca tggttcactt gccgccgccc ggccggacgg ggacgcttgc 23520
cggtgcgcgc cgcacgccca gtcgccgggg gccggtcccc ctgttccgag gaggactgca 23580
tggtctcgct cgcgctgccc aagggcacgt tcctggagcg tcccgtcctc gatctgttcg 23640
ccgcggccgg gctggaggtg cgccggccgt ccgagcgcag ctaccgggcg agcatcgcct 23700
acgacggcgg cgtcgaggtg gccttccaca agccgcgcga gatcccgctc gccgtggaga 23760
gaggcgtctt cgacttcggc gtgaccggca ccgactggat cgaggagacc ggcgcgaagg 23820
tcgagctggt cgaggcgagc gggtgcgtgc cgccctggcg gctggtgctc gccgtcccct 23880
ccgggcatcc cgccgtggac gccgccggcc tgcccgccgg ggcccgggtc gcgaccggct 23940
tcccgaagat ctcccggcag tacttccaga gcgtgccgct gcccgtacgg atcgtcccgt 24000
cgttcggcgc cacggaggcg aaggtccccg agctggccga cgccgtcatc gagaccgacg 24060
gccccggctc cgccctggac gagcacgacc tgcgcgtcgt cgccacgttg cgcacctgct 24120
taccgcaggt cgtcgcgagc cccgccgcct ggcgggacgc ccgcagacgc gccgcgatcc 24180
agcgggtcgc gcggctgctg gcctcggtcg acgggggagc ggcccacgtg ctgctgaccg 24240
tccgcaccac cacgcgcgac ctgccacggg tggccgggtc gatgccggaa cggtcctggc 24300
gggccggcac cggcctgacc gagaacctgg tggtcgtgca ggggctggcg gcccggcgcg 24360
gcctggccga gaccatcggc ggcatcctgg ccgccggcgc gctcgacgtc atcgagtcgc 24420
gcgtcgggaa ggacgtcacg ccgtgagccg ccggatgatc gtgtgcgacc tcgacgggac 24480
gctcctggac tcccgcggcc aggtctccga gcgcaccagg accgccgtgc gccgcgcccg 24540
cgccgccggg cacgtgttcg tcatcgcgac cgcgcggccg gtccgcgaca cccgtccggt 24600
ggccgcggcg ctggaccacg cggccgtcgc cgtctgcggg aacggctcga tcaccttcga 24660
cttcggcagc gaggaggtgg tcgactaccg cccgctggac cggcagccgc tcgccgcgac 24720
gctggcgctg ctgcgcgacc gtttccccgg cgtgcgcctg ggcgccgagt gccgcctcga 24780
actgctcctg gaggacgcct tccacctgcc cgagccgctg gcccgcgacg cccggcgggt 24840
gccccgcctg gagggcgaga tcgaccggca cgacgtcggc aagctcatgg tccagctcga 24900
aggcgccgcc cgccagtact acgagaccgt gcgcgggctg ctgaccggct gcgaggtcac 24960
catctcggcc gacgtgttct gcgaggtcat gcggtccggc gtgaccaagg ccgcggccct 25020
ggagtcgatg gcgtccaggc tcggcctggg cagcgccgac gtgatcgcct tcggcgacat 25080
gccgaacgac ctccccatgc tgacctgggc cggaaccgcg gtggcggtgg ccaacgccca 25140
tcccgccgtg ctcggcgcgg tgggcgaggt gaccgcctcc aacgacgacg acggggtggc 25200
cgcgtggctg gaacggcacg ccatggccga tttctcagag aagtattgac gcttgtctga 25260
cgactattca tgatggtcgc tggagccggg aacgcaccgg ccaggatcgg cctgcggagg 25320
caggccgcca aggcccgcgc cgggcgcggg cgtcccgtct gagaggtgcc ccagtgcccc 25380
gattagccgt tgcccctgtg ttcgtccgcg tcctcgcact cttcgcggcg ttgtgctcgc 25440
tcgccgcctg caccgccgtc agccccgccg ccgagaagaa gggcgccacc tcggtcggcg 25500
tggccttcag ccgcggcggg cgcggcgaca gatccttcaa cgactcggtg ggggccggcg 25560
tcgaccgcgc gaaggccgag ctgggcgtct cggccaagga gctcagcccc aacgccgccg 25620
ggtccgacct ggaggagatc ctgcggctgc tggcccagac cggccacaac ccggtgctgg 25680
ccgtcggctt cttctacgcc cagccgctgg ccaaggtcgc ccccgccttc ccccgcaccc 25740
agttcgtcat catcgacgac gccaccgtga agctgcccaa cgtcaccaac gtggtgttcc 25800
gcgaggagca ggccagctac ctggtcggcg cggcggccgc gctcaagtcc gccagcggga 25860
agatcggctt cgtcggcggg gtgcagacgc cgccgcagga gaagttcctc gccgggtacg 25920
cggccggcgc ccgcgcggtg aaacccggca tccagctctc cagcgtgttc atgtcccagc 25980
cgcccgactt cagcggtttc tcggcccccg acaaggccca ggaggcggcc cgcggcatgt 26040
acgacggcgg cgccgacgtc gtctaccacg ccgccggggc gtccgggctc ggcgtcttcc 26100
aggcggccaa ggcggccggc aagtgggcga tcggcgtcga cgtcgaccag cgcaagacgg 26160
tcgatcccgg gctcgccggg gtgatcctga ccagcgccac caagcagctc gacgtcgcgg 26220
tcctgagcat cgtccgggac gcggtggcgg gccggccgat cgccggggtg cgcgagttcg 26280
ggctcaagga gggcggggtc ggctacgcca ccagcggcgg cgagctcgcc gacatccagc 26340
cgaggctgga ggaactgcgc aagcaaatcg tctcgggtga gatcaaggtg cccgccaggc 26400
cggcggagag ccgatgagcc gggcgcgggc ggcgggccgg gaagcgctgc gcgtgctcgc 26460
cgtcccggcg ctcgcgctgc tgatcgcgct cacggtcgcc accgcgctgc tggccgcggt 26520
gggggccgcg ccggggcgga cgtacctgac gatgctggag ttcggagtcc ggccggactc 26580
gctcgtctcc gcggtgaacc ggtcggccgt ctacttcttc gccgccctcg ccgtggccgt 26640
caccttccgg atgaagctgt tcaacatcgg ggtcgagggc cagtaccacg tggccgcgct 26700
gttcgcggcg gtcgccgggt cggccgtgcg gctgccgggc ccgctgcacc tggcgttcgt 26760
ctgcctggtc gccgtgctgg ccggagccct ctgggccggg atcgccgcgg tgctcaaggt 26820
gtggcgcggg gtctccgagg tgctgtccac gctgctgctg aactacgtcg ccaccgccgt 26880
cgtcgcctac ctgctggccg aggtcttcgc cgcgggcagc acctccggct cgacgcagga 26940
gctgccgccg tcggcccggc tgccgcagct cacgctgccg tcgggcgacc agctcggctc 27000
ggccacgctg ggcgcggtcg tgctgggcgt ggcggtgcac gtgctgctca cccgcacccg 27060
ctacggctac gagctgcggg ccggcggcct gaacccggtg gcggcgcgcg tccagggcat 27120
cgacccgcgc cggatgatcc tcgtcacgat gttgttgtcc ggcggcctgg cgggcctggc 27180
cgggctgccc gacctgctgg gccagaccta ccggtacggc atggacttcc cctccggcgc 27240
gggcttcacc gggatcgccg tggccctgct cggccgcaac acgcccgcgg gcatcgcgct 27300
ggccgccctg ctcttcgggt tcctggaacg gtcggcgctg atcctcgaca tcgagggcat 27360
ccccaaggag atcgtcacga tcatgcaggg cgtcatcgtg ctcagcgtcg tggtggccta 27420
cgaggtggcg cggcggatgg aacgcaggcg ggccgagcgg ctcgtacggg agcaggcgtc 27480
cccggcgctg gagccgtccg tatgagcgcg ggacggtacc ggtggtggcg ggccgcgctc 27540
gtcctgttgc tggcgatgtc ggtggtcagg gtgttcaccg gcgccgacga gctcaccggg 27600
tcgggcgtcg tgggggcggc gctgcggctg gcggtcccga tcgggctcgc cgggctgagc 27660
ggcctgtgga gcgagcgcgc gggcgtcgcc aacctcggca tcgagggcat gatgatcatc 27720
ggcacggtct gcggggcctg ggccgggtac gggttcggcg tctgggccgg ggtgctggcc 27780
ggcacgctcg gcggggtcgc ggccgggctg ctgcacgcgg tcgtcaccgt caccttcggg 27840
gtcgaccagg tggtgtgcgg catcgcgatc aacatcctgg cggtgggcgt ggcccggttc 27900
atgtcggtca tcgccttcag cggcctgccg ggggcggggg ccacccagtc gccgccggtg 27960
agcgggtcga tcggcacggt gagcgtgccc gtcctggcgg gcggcggggg cacgcccgac 28020
ctgctcggcg ggctggagcg cggccggttc ttcctggtgt ccgacctcgc cgggatcgtg 28080
cgggggttcc tgcacgaggt ctcgctgctc accgtggtcg ccgtcgcgct ggtcccgctg 28140
agcttcgtgc tgctctggcg taccgcgttc gggctgcggc tgcggtcggc gggcgagcac 28200
ccggcggcgg cggactcgct cggggtgccg gtgctgcgga tgcggtacta cggcgtgctg 28260
gtctccggcg ggctggccgg gctcggcggg gcctacctgg tcaccgtctc ctcgaccgtc 28320
taccgcgagg gccagaccgg cggccggggc ttcatcgggc tggccgccat gatcttcggc 28380
aactggcggc cgtccggcac ggccgccggg gcgctgctgt tcggttacgc cgacgccctg 28440
caactgcgcc aggccagcgc ggtgcacgtg ctgctgctgc tcggggccgt gctggcctgg 28500
gtcgccgccg tcgcgctcgt ccgtacgcgg ctgcgcctgg ccgcgacgct ggcgctgctc 28560
ggcgccgccg ccttcgcggt cttccagggc accgactcgg tccccaccca gctcgtgtac 28620
gccacgccgt acctcaccac gctgctggtg ctggtcacgg tcgcccaacg gctccgcatg 28680
ccggcctacg tcggcatccc ctaccggcga ggagacgcgt gatgaccatc gaccgggagc 28740
cccgcctgat ggggtacgcc gaccgctggt ccgcggcgcc cggtgagcgg ctggccgtgc 28800
acgcctccgg cccggacggg gacgccgacg ccgccctcgt ggacctcgtc cgggagagcg 28860
acctgacccc ggccgtgccg gtgacgatcg cgccgcggcc gctgcggccc ggctcccggc 28920
tggagacggc ctcggtgccg acgcccgcca ggggcggcct cgccctctgg atccggccga 28980
agcggctcgg gccgcgccag gtcgtggccc gcctgggcgc ctcggaatgg cggctgctgc 29040
tcgcgctcga cgaggagggc aggcccgagg ccaccgtgct gacgccgtcg gggagcatgg 29100
gctgccggct gcgctcggcc gtcccggcgg gggagtggtc actgctggcc gtggcgtggg 29160
aggcggcgcc gctgatgacc gtcctgcatc tgaacgactt ccgcaccccg ggcgtgagcg 29220
ccgggcgggc ccgcctgccg ctgccgtccc cgtcgggcga gctgctgatc ggggcgggcg 29280
acgagcatcc gtacgcgggc cggatcgccc tgcccatgct gaccgagggc ccgctcgcca 29340
tgcccggccg cgttccgctg accgacctga ccaccggtcc cgagctggtc aaacacctcg 29400
gccccgagct gctggccctg tgggacccgt ccgccgcccc cggcgcgccc gtcgtcccgg 29460
acgtcagcgg caacgggcac cacgcgaacg ccgtcaaccg gccgctggcc gggctcgccg 29520
ggccgggcgg cggcgcccgc accgcgctcc aggtcagccc ggaggacctc gacgacgccc 29580
gctggccggt gacgtgcgag ctgaccgtcc cgccggacgc cggctccacc gtgctcggcg 29640
tgcgcctgcg gacccggggg cagtcgtgcg tgctgccggt ggtcgtgcgc ccggacgcga 29700
cgcgcaggag cccggtcgcg gtcgtcctgc ccacgttcac ctacctcgcc tacgccaacc 29760
accggcagtc cagcgaggcc gagtacttcg gcgactaccg gcaggtcacc gaccgcgtga 29820
tcacgctctc cccggtggac gcctacctca acgagcaccg cgagctcggc ctgtccctgt 29880
acgaccagcg gcgcgagggc ggccacgtca cccactcctc cagacgccgc ccggtgctca 29940
acctcacccc cgactaccgc tggtggatga ccggcgcgcc gcgccacttc gccgccgacc 30000
tgctgctgct gcgctggctg cgctcggccg gcttcgcctt cgacgtgctg gccgacgagg 30060
acgtgcacga aggcggcgcc ggcctgctgt ccgcctaccg cgtcgtcctg accggcagcc 30120
accccgagta cgtcaccgcc cccgaacgcg cggccttcgg cgcgtacgtg tccggcggcg 30180
gccggctcat gtacctgggc ggcaacgggt tctactgggt gaccggcgtc gacccccggc 30240
acccgcacgc catcgaggtc cgccgcggcc acgtcggcga ctggtgggac gacgagcccg 30300
gcgaggacgt gctgacctcc accggcgagc ccggcggcct ctggcggcat cgcggcctgg 30360
cgccgcagga gctgctcggc gtcggtttcg ccgcccaggg ctgggggccc gcccccggct 30420
acacccgcct gcccgacagc taccgggagg aggcgtcctg ggtgttcgag ggcgtcggga 30480
ccgacccggt gggctgctgc ggtctcgcgc tcggcggggc ggcgggggac gagatcgacc 30540
gggccgcgcc cgagctcggc accccgccgg acaccctggt gctcgcgacc tccgcgggcg 30600
gccacggcca ccactaccgg ccccctaccg aggaggccgg gctgggcgcg caggccgtac 30660
gcgccgacat gaccctgcga ctgctcggat caggcggcgc ggtgttcgcc gccggatccg 30720
tcaactgggt gaccagcctc gcctgctgtg acggtgacat caggcgaatc acccggaacg 30780
tcctgtcccg attcctgcaa cccgcagctt gtgaggagga gtgacgtgac caccgcgagc 30840
gcgaggcacc tcgactaccc ctatccccgc acgtcggccg agctgaccga cgagatgttg 30900
tccatgcccg cagaggaact ggcggccatc gcccgccatc ctctgaccgt cttccccgac 30960
accaccgagc tctactaccg gctggcccgg gagatggccg acgagttacg cgcccgcaac 31020
atgcgcggcg agcccacccg ctggatcctg cccgtgggcc cgaaggcgca gtacccgatc 31080
ctggcgcgca tctgcaacga ggagcgcatc agctgggccg acgtgttcgc cttccacatg 31140
gacgagttcc tcgactggca gggccggccg gtggacccct cgcacccctt cagcttccgc 31200
ggctattgcg accgcaacct ctaccagctc ctcgaccccg acctgcggcc ccacccgggc 31260
aacgtcgtct tccccgacgt gttcgacccg gacgcgttcg gcacgcgtct gcgcggcgag 31320
ggcggcgccg acacctgtta cgcgggcttc ggctaccggg ggcacctggc cttcaacgag 31380
ccgccgtcca cgcgctggca ccagtacacc gccgcggagt tcgccgcctc gaccacgcgc 31440
gtcgtcccgc tgctcgacga cacgatcgtg gcccactccc accgggtgac cggcggctac 31500
acgcaggcga tcccgcggat ggcgatcacc gtcggcatgt cggagatcct ctccgcccgc 31560
cggctgcacc tgatcaccga cgggggcgcc tggaagcggt acatcgtccg ggtgctgctg 31620
ctcaccaccg agccggacgt gctgctcccg gtgaccttcg cgcaccacca ccccgacgtg 31680
cgcgtgaccg tcgacgccga cagcatcgcg ccctgcgccc tggggctggg ctcgtgaccg 31740
tcgtcgcggt cggggcctac atcgtcgacc tctacatgtt cggcgaccac ctgccggagc 31800
cgggcgagag cgtcaacgcc gccgactacg ccaccgcgca cggcggcaag gccgcgaacg 31860
tggccgtcgc cgcggcccgg ctgggcgcgc ccgcccggtt cgtcggctgc ctcggcgacg 31920
acgccgcccg gccggtggcg ctggccgagc tgcgcgccga gggcatcgag gtcggcgagt 31980
gcaccacggc cgtcggcgtc gcgaccgggc gcagcttcgt ctacgtcgac gcgggcggca 32040
ggcagatggt catgacctgg ccgggggccg ccgacctgct gccgcccgcc caggcggccg 32100
cggtcgccgc cggcctgccc cccgggtccg tgctcgtcct gcagggggag atcccgctgg 32160
agatctccta cgccgccgcg acggcggccc cgccgggcgt gcgcgtcctg ctcaacccga 32220
gcccggcggg accgttcctg gaggggaccg ggaaggagct gctggcccgg gccgacctgc 32280
tggtgctcaa cgagggcgag ctgtccgcgc tggccggcga gacccccggc gacctggacg 32340
acctgcgcgg caggaccagg ggacgcaccg tcgtcgtcac ccgcggcgag cacggcgccg 32400
agatcgccga cgacggcggc actatgctga tcgacgtccc cagcgtgcgc gttgtggaca 32460
ccaccggcgc gggtgacgcc ttcaccggcg ccctggcggc cgccctctgc gccggcgcgg 32520
gcctggtgcg cggcgtccgg ctcgcctgcc aggtggcgac cgtctcggtc acgcgccggt 32580
tctgtgcccc cagctaccca tcggccgccg agctcggcat cgccccgctc cgcggtccgt 32640
ccgtgcagga gagacccgcg tgacacgacg tcatcccacc ctggccatcc agatccgtga 32700
ccggctgcac gccctgatcg tcgagcaggg gctgcgcccc ggtgacaggc tgccgtcgga 32760
gaacgagctc atgaccctgt tcgacgtcgg gcgcaccagc atcagggagg cgttcaagct 32820
cctggagcag gaagggctca tccaggccag gcacggcgac ggccgctacc tcacctcgca 32880
gcccagcctg gaccggccgc tgacccgcct cgaaggcgtc accgagatgc tggccagccg 32940
cgggttcacc gccgacaaca ccgtcctcga cgtcatggcc accgagccgg accgccacca 33000
gcgggagctg ctgcaactgc cgcccggcga ggccatcgtc cggctggaac gcctgcgcag 33060
gcaccgggac gacgcgctgc tgtactcgat cgacctgttc ccgcgctcgc tcatcggccg 33120
ccccctggac gaggtggact ggaccggctc gctgttccag ctcctcaccg agcacggcca 33180
caccatcgcc tacgccgtcg cccaggtccg cgcggtgacc ctgagccacg cccaggccga 33240
gcgcatcggc acccacgagg acggcggcgc ctggctgctg ctcctccaga cccaccacgg 33300
cagcgccgga cggccggtgc tgtattcgca ggactaccac aggggcaccg acttctcctt 33360
ccacctggtg agacgccgcg actgagtgag ccgccgccgc gtaccgaccg accgccgttc 33420
gccgacgcag agaggatcat gcgatgagtg ccatcacgct ggaggccgcc cagccgatcc 33480
cggcctcccc ttccagcctc cgggcgaggt gctccaggct gtccgtcttc tggctcgccg 33540
aggacgaggt ctccttctac gggctcccgc cgctgctgat gaaggtgtcg aacgtggccg 33600
gggtcgcggt cgtgcgcagc tgcgcggagg cccggctgct gctggccggc gaggacttcg 33660
acgtggcggt gatcccgctc gccgtactgc cggacgtgct cggcggacgc ccccgccggg 33720
cccgcccgca gatcctggcc atggtgcgtc agggggagga gctcgccccc tgcgtcgccc 33780
agcggtacgg gctggccggt tgcctgctgt gggacgagat caatgtcgcc ggcttgtcgg 33840
gcaccttcga tcggctgctg cgcggcgacg cgcccgtccc gtccaggggc gcccccgacc 33900
tgctcggccg cctgaccgag cgcgagcaca tggtgctggc gctgctgctg cgcggcatga 33960
gcaaccacca gatcgcccgc tcgatgggca tctccatcca cggcgtcaag cgccacatct 34020
cgaacctgct cgtcaagttc aactgctcca accgcaccga ggtcgccctg gtcgcccagc 34080
gtctcggcct cgaccccgcc tcccgcaacc cgcgccacct gagcagcacc cgcacgtagc 34140
acgtacgacg aaccatcccc gaacggtcat cgagaaacgg agtgacaatg gacgtccctc 34200
tcatggaact cagcggccgc gcccccgtcg tcaggctgca tgacatcgag gcggacatgg 34260
ccgccgccac cgacgccatc aggtcgcagc tgaccggatg gggcttcatg gccgcggagg 34320
tgcccggcat cggcgagcgc gtcgaggcca tgatgaacga gttcgccgcg gcctgccggg 34380
cgaccgggcc gagcctgtcc gactacgcct acgacgtcgt cccgcagctc gccgtcggcg 34440
gcacgcacgg gttcttcccg tacaactcgg agatcccgcg cctggccaac ggcgtgcccg 34500
acccgaagga gttcatccac gtcagcggcg ccatgatcgg cgaccagccg cccggggcgg 34560
gtgacgtgct gcgggccttc ccggcgttcg gcacccgcgc cgccgaggtg ttcgacatcg 34620
ccttccggct gatctcgctc ttcggcgagg tcgtccgggg catgatgccg cccggcacgc 34680
cggagctgga cctctcgcac gacgcgacga acctgcgggt gatccactac cgggacgtcg 34740
gcgaccgcga ggtgctggcc cacgagcact ccggcatcca gatgctcggc ctccagctgc 34800
ccccgtccga ccagggcctg cagtacgtgc tgcacgacgg cacctgggtc gagccggtga 34860
tcgccgggac cgacgtcgtg ctgtgcaaca tcggccggat gctcaccagc gcctccgacg 34920
ggcggttccg gccgtccacg caccgggtgc acaccaagcc gatgccggcc ggctacgagc 34980
gcctgtcgtc ggtgctcttc gcctacccgc agcacaaggc ccgccagtgg aagatggtgg 35040
acggcgagct gatgtcgctg aacgccacct ggggcgactt catcgacagc cgcttccagg 35100
ggctcggcaa gcagtcctga ccacccggcc tcctggcggc cgggcggccg tcaggaggcg 35160
gtggcccgga cggccgcccg gatcacgtcg gcgacctcct tcggccgcga caccgcgacc 35220
gcgtgcgagg cgccggcgac ctccacggcg gtcgcgccgg cccgctcggc cccgaaccgc 35280
tgcacgtcgg ggttgatggc ctggtcggcg tcggcgatga cggcccacga cggcttggcc 35340
cgccaggccg ccgtggccgc cgcctcggtg aacgccccgg ccgccagggg ccgctgcgcc 35400
accgcgagga cccgggtgac ctcggccggc acgtcggcgg cgaagacctg ggggaaggcg 35460
tcggcggcga tggtgaactc gacggcggcg ccgccgtccg gcagcgggta ggccgcctgc 35520
cgcaggctcg cgccgagcgg cggctcgggg aagcggccct ggagctcgcc caggctctcg 35580
ccctcgtcca ggacgtacgc ggcgacgtag acgaggccga cgacgttctc cgccgcgccg 35640
gccacggtga tgatcgcgcc gccgtacgag tggcctgcca gcacgaccgg gccgtcgatc 35700
tgggccacca cggaggcgac gtacgccgcg tccgaggcca ggccgcgcag cgggttcggc 35760
ggggccacca cagggacgcc gtcccgctgc aactcggcga tgacgccgga ccagctcgcc 35820
gcgtcggcga acgcgccgtg cacgaggacg accgtcgggg tggtgctctc cgtcatggga 35880
aaaactcctt ctcggggcaa gaggggaccg gtcggccccg gtagcgatcc aacagccggg 35940
ccgctcgcgc gcacgaggcg gaaacgtact tctgcctgtg gttccgcctc gtctttcccc 36000
aggacggtcg cttggatggg tgacggcccc cctgcttccc accccttctt cctggagcga 36060
tcatgatcaa gcccgtcctg gaacccgccg ctcaggcgtt cgccgaggcc accgccgagc 36120
cgccgtacct cttccagctc ccgccggagg agggccgcaa ggccgtcaac gaggtgcagt 36180
ccggcccggc cgagctgccc gccgtggacg aggagtgggt gacggtcccc ggcggaccca 36240
ccggcgaggt cagggcccgc atcgtgcggc cggccggcag caccggcgac ctgcccgtga 36300
tcatctacat ccacggcgcc ggctgggtct tcggcaacgc ccacacccac gaccggctgg 36360
tccgcgagct ggcgaccggc gccggcgccg ccgtggtctt ccccgagtac gacctgtcgc 36420
ccgagcaccg ct 36432
<210> 2
<211> 479
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 2
Met Arg Gly His Asp Pro Leu Asp Asp Thr Glu Gln Ala Leu Leu Pro
1 5 10 15
Ala Pro Ala Arg Gly Leu Ala Val Ala Leu Leu Leu Ile Ala Val Ala
20 25 30
Gly Val Ser Met Leu Leu Val Phe Gly Pro Gly Ala Gln Gln Pro Gly
35 40 45
Gln Ala Ala Ala Pro Ala Ala His Ala Gly Val Pro Pro Ala Gln Ser
50 55 60
Ile Leu Arg Leu Phe Leu Ala Val Ala Val Ile Gly Gly Leu Ala Ser
65 70 75 80
Leu Gly Gly Val Leu Ala Arg Arg Ala Gly Gln Pro Ser Val Val Gly
85 90 95
Gln Met Val Ala Gly Leu Leu Leu Gly Pro Ser Ala Leu Gly Ser Leu
100 105 110
Ala Pro Gly Val Ser Ala Leu Ile Leu Pro Gly Trp Val Ala Pro Gln
115 120 125
Leu Glu Leu Leu Ala Gln Ala Gly Leu Ala Val Phe Met Phe Thr Val
130 135 140
Gly Arg Glu Phe Asp Pro Ala Ala Leu Gly Arg Gln Arg Val Val Val
145 150 155 160
Gly Val Ala Ala Leu Ala Met Thr Ala Val Pro Phe Ala Leu Gly Ala
165 170 175
Val Ala Ala Val Pro Phe Ala Gly Ala Leu Ala Gly Pro Ser Ala Ser
180 185 190
Gly Thr Ala Phe Val Leu Phe Ala Gly Thr Ala Leu Ser Val Thr Ala
195 200 205
Phe Pro Val Leu Ala Arg Ile Leu Gln Glu Ala Gly Leu Thr Ala Thr
210 215 220
Arg Leu Gly Ser Leu Ala Ile Leu Cys Ala Gly Leu Ala Asp Val Leu
225 230 235 240
Ala Trp Cys Ala Leu Ala Ala Val Ile Ala Leu Ala His Ala Gly Ser
245 250 255
Pro Ala Gly Val Leu Thr Ser Leu Gly Leu Ala Ala Ala Leu Val Leu
260 265 270
Ala Val Thr Leu Val Val Arg Pro Ala Leu Ala Ala Leu Ala Ala Arg
275 280 285
His Ala Ala Arg Arg Leu Pro Ala Pro Ala Ala Leu Ala Leu Val Leu
290 295 300
Gly Leu Ile Phe Ala Leu Ala Ala Ala Thr Asp Ala Ile Gly Val His
305 310 315 320
Ala Ile Phe Gly Ala Leu Leu Ala Gly Val Ala Phe Pro Arg Asp Ala
325 330 335
Pro Val Leu Gly Ala Val Pro Glu Arg Leu Gly Ser Leu Asn Arg Ala
340 345 350
Met Leu Leu Pro Val Phe Phe Ala Ser Ile Gly Leu Arg Thr Asp Val
355 360 365
His Leu Ala Phe Gly His Pro Val Val Leu Leu Gly Gly Ala Val Leu
370 375 380
Leu Val Ala Ala Val Leu Gly Lys Leu Gly Ala Ala Gly Val Ile Ala
385 390 395 400
Trp Ala Gly Gly Met Pro Gly Arg Leu Ala Leu Gly Leu Gly Val Leu
405 410 415
Met Asn Ala Arg Gly Val Thr Glu Ile Val Val Leu Ser Thr Gly Leu
420 425 430
Gly Met Gly Leu Ile Ser Pro Gly Thr Phe Thr Val Leu Val Leu Met
435 440 445
Ala Leu Ile Thr Thr Met Met Thr Ala Pro Ala Leu Arg Arg Leu Gly
450 455 460
Leu Tyr Ala Pro Ala Arg Ala Glu Pro Arg Thr Pro Gln Pro Ala
465 470 475
<210> 3
<211> 402
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 3
Met Asp Pro Arg Ser Ala Ala Lys Ile Thr Glu Ala Ser Pro Arg Arg
1 5 10 15
Ser Pro Trp Leu Thr Ile Leu Leu Val Tyr Val Gly Gly Val Val Gly
20 25 30
Ala Met Ser Leu Gly Lys Phe Ser Ser Val Gly Pro Ala Met Ala Ala
35 40 45
Gln Leu Gly Leu Ser Leu Pro Glu Leu Gly Trp Val Ile Ser Ala Val
50 55 60
Val Gly Val Gly Ala Val Val Gly Leu Pro Ala Gly Tyr Leu Val Arg
65 70 75 80
Arg Phe Gly Thr Gly Arg Ser Leu Val Thr Gly Leu Val Leu Ile Ala
85 90 95
Ala Ala Gly Ala Ala Ser Met Ala Ala Asn Asp Phe Ala Trp Leu Leu
100 105 110
Ile Ala Arg Gly Val Glu Gly Val Gly Tyr Val Leu Val Thr Ile Ser
115 120 125
Cys Pro Ala Leu Ile Leu Arg Leu Ala His Glu Arg Asp Arg Gly Thr
130 135 140
Ala Leu Ser Val Trp Ala Thr Phe Val Pro Val Gly Leu Gly Ala Ser
145 150 155 160
Thr Leu Ala Gly Gly Ala Ala Gly Asp Ala Leu Gly Trp Arg Gly Trp
165 170 175
Thr Gly Leu Ile Ala Gly Leu Thr Leu Leu Ile Ala Ala Val Thr Trp
180 185 190
Ala Arg Leu Pro Arg Ala Asp Gly Ala Gln Ala Ala Gly Pro Val Pro
195 200 205
Arg Ala Arg Ala Leu Thr Trp Pro Gly Val Leu Ala Ala Ala Phe Cys
210 215 220
Leu Thr Ala Leu Val Thr Ile Pro Val Ile Val Leu Leu Pro Thr Leu
225 230 235 240
Leu Ile Glu Gln Tyr Gly Arg Ser Ala Ala Ala Ala Gly Ala Leu Thr
245 250 255
Ser Val Ile Ser Leu Leu Gly Val Ala Gly Gly Leu Ala Val Gly Val
260 265 270
Leu Leu Arg Arg Gly Val Arg Val Gly Val Leu Ala Leu Ser Gly Phe
275 280 285
Leu Val Val Pro Ala Ala Trp Ile Leu Tyr Ala Gly Gly Ser Ala Gly
290 295 300
Gly Ala Leu Thr Gly Ala Gly Val Ile Ser Met Glu Asn Gly Phe Leu
305 310 315 320
Gly Ala Leu Val Phe Ala Ala Leu Pro Leu Val Leu Glu Arg Leu Asp
325 330 335
His Ala Asp Val Gly Asn Gly Val Ile Ala Gln Ala Gly Ser Leu Gly
340 345 350
Ser Leu Leu Gly Pro Pro Leu Phe Gly Leu Val Ala Gly Glu Ile Gly
355 360 365
Tyr Gln Ala Leu Ile Pro Val Ile Ala Val Gly Ile Val Ala Ala Val
370 375 380
Gly Ala Leu Leu Leu Ile Gly Arg Arg Ala Ala Arg Pro Leu Pro Ala
385 390 395 400
Ala Asp
402
<210> 4
<211> 295
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 4
Met Ile Ser Leu Ala Leu Pro Lys Gly Ser Leu Glu Arg Arg Ala Leu
1 5 10 15
Glu Leu Phe Ala Thr Ala Gly Leu Pro Val Arg Arg Arg Ser Asp Arg
20 25 30
Ser Tyr Arg Gly Thr Val Asp His Ser Gly Val Ser His Val Ala Phe
35 40 45
Tyr Lys Pro Arg Glu Ile Pro Leu Val Val Gly Asp Gly Thr Phe Asp
50 55 60
Leu Gly Leu Thr Gly Ala Asp Trp Val Glu Glu Thr Gly Ala Glu Val
65 70 75 80
Glu Val Val Glu Ser Phe Asp Tyr Ser Arg Asn Thr Glu Arg Pro Trp
85 90 95
Arg Leu Val Leu Ala Val Pro Ser Gly His Pro Ala Arg Ser Ala Ala
100 105 110
Asp Leu Asp Asp Gly Val Arg Val Ala Thr Glu Tyr Pro Asn Ile Gly
115 120 125
Arg Arg Phe Phe Arg Glu Ala Gly Arg Arg Ala Glu Val Val Val Ser
130 135 140
Tyr Gly Ala Thr Glu Ala Lys Ile Pro Glu Leu Ala Asp Ala Met Met
145 150 155 160
Asp Val Val Glu Thr Gly His Ser Leu Arg Gln Asn Asp Leu Arg Val
165 170 175
Ile Glu Thr Val Arg Thr Cys Gly Ala Gln Leu Val Ala Asn Pro Ala
180 185 190
Ala Tyr Ala Asp Pro Ala Arg Arg Gly Val Ile Arg Asn Val Ala Leu
195 200 205
Leu Leu Arg Ala Ala Trp Val Gly Pro Ala His Val Leu Leu Thr Val
210 215 220
Arg Ala Pro Ala Asp Arg Leu Ala Glu Val Val Pro Ala Met Pro Gly
225 230 235 240
Gln Ser Trp Arg Ala Gly Ala Asp Leu Ala Asp Gly Thr Met Val Val
245 250 255
Leu Gln Gly Val Leu Pro Arg Arg Glu Val Pro Glu Thr Ile Asp Ala
260 265 270
Leu Leu Ser Ala Gly Ala Val Asn Val Thr Glu Ser Gly Ile Gly Ile
275 280 285
Phe Ala Thr Gly Glu Ala Gly
290 295
<210> 5
<211> 234
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 5
Met Ser Ser His Glu Thr Pro Ala Ala Leu Val Thr Gly Ser Asn Arg
1 5 10 15
Gly Ser Gly Arg Ala Ile Ala Glu Glu Leu Arg Glu Arg Gly Tyr Arg
20 25 30
Val Phe Ser Leu Asn Arg Ser Leu Ala Gly Glu Asp Trp Leu Gly Glu
35 40 45
His Arg Cys Asp Leu Ser Asp Pro Ala Gln Ile Ala Glu Ala Thr Ala
50 55 60
Gln Val Leu Gly Ser Ala Gly Arg Leu Asp Val Cys Val Ser Asn Ala
65 70 75 80
Val Asp Arg Val Leu Asp Pro Ile Ala Glu Leu Arg Ala Glu Asp Trp
85 90 95
Glu Arg Leu Val Ala Ile Asn Leu Ser Ala Cys Phe His Leu Leu Lys
100 105 110
Ala Val Leu Pro Ala Leu Arg Ala Ala Gly Gly Met Phe Val Ala Met
115 120 125
Gly Ser His Ala Gly Thr Arg Tyr Phe Glu Gly Gly Ala Ala Tyr Ser
130 135 140
Ala Thr Lys Ala Gly Leu Lys Ala Leu Val Glu Thr Leu Leu Leu Glu
145 150 155 160
Glu Arg Arg Asn Gly Val Arg Ala Cys Leu Val Ser Pro Gly Ala Ile
165 170 175
Ala Asn Leu Asp Gly Asp Gly Ser Pro His Lys Met Ser Thr Arg Ser
180 185 190
Val Ala Arg Cys Val Ala Thr Ile Val Asp Ser Trp Pro Asp Asp Leu
195 200 205
Val Val Gly Glu Ile Glu Val Arg Pro Ser Cys Leu Pro Glu Pro Pro
210 215 220
Val Val Gly Ile Asp Arg Leu Leu His Val
225 230
<210> 6
<211> 239
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 6
Val Asp Asn Gly Ser Glu Pro Gly Thr Pro Ala Gly Arg Gly Glu Pro
1 5 10 15
Asp Ile Val Gly Arg Ser Lys Arg Leu Trp Leu Leu Pro Asp Gly Arg
20 25 30
Cys Leu Val Gln Ile Ile Pro Ser Leu Arg Ser Phe Thr Tyr His Arg
35 40 45
Asp Glu Ile Val Glu Arg Thr Gly Pro Leu Arg Leu Asp Phe Tyr Glu
50 55 60
Arg Ala Ala Ala Arg Leu Ala Asp Ala Gly Val Pro Cys Ala Phe Arg
65 70 75 80
Glu Arg Val Ser Ala Asp Ser Tyr Val Ala Asp Tyr Cys Pro Ala Pro
85 90 95
Pro Phe Glu Val Ile Val Lys Asn Arg Ala Thr Gly Ser Thr Thr Arg
100 105 110
Lys Tyr Pro Gly Leu Phe Glu Asp Gly Ala Pro Leu Pro Arg Pro Val
115 120 125
Val Lys Phe Asp Tyr Arg Thr Asp Pro Glu Asp Gln Pro Ile Gly Glu
130 135 140
Asp Tyr Leu Arg Ala Leu Asp Leu Pro Val Glu Glu Met Arg Glu Leu
145 150 155 160
Ser Leu Ala Val Asn Asp Val Leu Arg Ser Trp Leu Arg Pro Leu Glu
165 170 175
Val Trp Asp Phe Cys Leu Ile Phe Gly Arg Thr Ser Asp Gly Gly Leu
180 185 190
Thr Val Ile Ser Glu Ile Ser Gln Asp Cys Met Arg Leu Arg His Gln
195 200 205
Asp Gly Ser Pro Leu Asp Lys Asp Leu Phe Arg Asp Gly Val Ala Gly
210 215 220
Glu Glu Ile Val Asn Gln Trp Ser Arg Val Leu Asp Val Ile Thr
225 230 235
<210> 7
<211> 425
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 7
Met Gly Gly Gly Ser Leu Ser Glu Tyr Glu Arg Ile Ala His Ser Lys
1 5 10 15
Ile Thr Arg Lys Phe Arg Ser Ala Gly Gln Gln Leu Leu Phe Ser Arg
20 25 30
Gly Arg Arg Gly Glu Val Ser Asp Ala Asn Gly Val Pro Tyr Ile Asp
35 40 45
Phe Val Met Gly Tyr Gly Pro Val Ile Ile Gly His Ala Asp Ala Gln
50 55 60
Phe Asn Glu Ile Leu Cys Gly Tyr Leu Gly Asn Gly Val Met Leu Pro
65 70 75 80
Gly Tyr Thr Thr Phe His Gln Glu Tyr Leu Asp Arg Leu Leu Gly Glu
85 90 95
Arg Pro Gly Asp Arg Gly Ala Phe Phe Lys Thr Ala Ser Glu Ala Val
100 105 110
Thr Ala Ala Phe Arg Leu Ala Ala Met Arg Thr Gly Arg Leu Gly Ile
115 120 125
Ile Arg Ser Gly Tyr Val Gly Trp His Asp Ser Gln Ile Ala Asp Ser
130 135 140
Leu Lys Trp His Glu Pro Leu His Ser Pro Leu Arg Asp Lys Leu Arg
145 150 155 160
Tyr Thr Asp Gly Met Arg Gly Val Gly Glu Ser Glu Pro Val Ala Asn
165 170 175
Trp Val Asp Leu Arg Leu Glu Ser Leu Ala Glu Leu Leu Glu Arg His
180 185 190
Arg Gly Arg Leu Gly Cys Phe Val Phe Asp Ala Tyr Leu Ala Ser Phe
195 200 205
Thr Thr Ala Asp Val Leu Arg Gln Ala Val Ala Met Cys Arg Glu Ala
210 215 220
Gly Leu Leu Thr Val Phe Asp Glu Thr Lys Thr Gly Gly Arg Ile Ser
225 230 235 240
Pro Leu Gly Tyr Asp His Asp Asn Ala Leu Gly Ser Asp Leu Ile Val
245 250 255
Ile Gly Lys Ala Leu Ala Asn Gly Ala Pro Leu Ser Ile Leu Ala Gly
260 265 270
Asp Ala Asp Leu Leu Ala Leu Ala Glu Lys Ala Arg Leu Ser Gly Thr
275 280 285
Phe Ser Lys Glu Met Ile Ala Val Tyr Ala Ala Leu Ala Thr Arg Asp
290 295 300
Ile Leu Glu Lys Pro Val Gly Asp Ser Pro Asp Gly Trp Thr Glu Leu
305 310 315 320
Gly Arg Ile Gly Thr Gln Val Ala Ala Ala Phe Thr Ala Ala Ala Ala
325 330 335
Asp Ala Gly Val Glu Ala Leu Val Gly Ala Arg Pro Val Leu Gly Gly
340 345 350
Gly Met Phe Glu Leu Val Tyr His Asp Val Glu Leu Leu Gly Asp Lys
355 360 365
Glu Arg Arg Glu Ala Leu Leu Ala Glu Leu Ala Gly Val Gly Ile Leu
370 375 380
Leu Leu Glu Gly His Pro Ser Phe Val Cys Leu Ala His Arg Asp Ile
385 390 395 400
Asp Trp Gly Asp Leu Arg Asp Arg Val Arg Gln Ala Phe Glu Ala Trp
405 410 415
Thr Ala Pro Thr Gly Ala Gly Arg Gly
420 425
<210> 8
<211> 351
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 8
Val Ala Asp Leu Leu Arg Leu Ala Leu Val Gly Cys Gly Arg Gln Met
1 5 10 15
Gln Gln Asn Leu Phe Pro Phe Leu Gln Arg Ile Arg Gly His Gln Val
20 25 30
Val Ala Cys Val Asp Pro Asp Leu Ser Leu Ala Ala Asp Val Gln Ser
35 40 45
Leu Ala Gly Gly Ala Thr Cys Val Ser Ser Val Asp Glu Leu Asp Leu
50 55 60
Glu Met Val Asp Ala Ala Val Leu Ala Val Pro Pro Glu Pro Ser Tyr
65 70 75 80
Leu Leu Val Arg Gln Leu Ala Glu Arg Gly Val Asp Cys Phe Val Glu
85 90 95
Lys Pro Ala Gly Pro Ser Thr Pro Ala Leu Gln Asp Leu Glu His Val
100 105 110
Val Arg Arg Ser Gly Arg His Val Gln Val Gly Phe Asn Phe Arg Tyr
115 120 125
Ala Glu Thr Leu Gln Arg Leu His Glu Leu Ser Glu Glu Ile Arg Ala
130 135 140
Thr Pro Cys Ser Val Thr Ile Asp Phe Tyr Ser Arg His Pro Ser Ala
145 150 155 160
Pro Gln Trp Gly Val Asp Thr Thr Leu Glu Ala Trp Ile Arg His Asn
165 170 175
Gly Val His Ala Ile Asp Leu Ala Arg Trp Phe Val Pro Ser Pro Val
180 185 190
Val Gln Val Asp Ala His Ala Ile Ala Ser Asp Ala Asp Arg Phe Gln
195 200 205
Ile Asn Leu Phe Leu Arg His His Asp Gly Ser Leu Ser Thr Leu Arg
210 215 220
Met Gly Asn His Val Lys Arg Phe Met Val Gly Val Thr Val Gln Gly
225 230 235 240
Met Asp Gly Ser Arg Tyr Ser Ala Pro Ser Leu Glu Arg Val Thr Leu
245 250 255
Glu Leu Ser Asp Gly Val Pro His Gly Gln Glu Leu His Ala Thr Arg
260 265 270
Asn Leu Asp His Gly Trp Gly Arg Ser Gly Phe Gly Pro Glu Leu Gln
275 280 285
Ala Phe Val Asp Ala Cys Ala Gln Arg Ser Ala Glu Pro Gln Thr Gly
290 295 300
Gly Arg Pro Pro Val Lys Gly Val Pro Ser Val Ser Asp Ala Leu Ala
305 310 315 320
Ala Ser Ala Leu Cys Asp Arg Val Met Ala Glu Leu Asn Thr Ala Ala
325 330 335
Thr Asn Gly Phe Gly Leu Leu Thr Gly Ser Val Arg Ala Ser Ala
340 345 350
<210> 9
<211> 595
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 9
Met Thr Glu Arg Ala Arg Pro Ala Ala Gly Gly Arg Pro Val Ala Ser
1 5 10 15
Pro Asp Pro Ala Gly Ser Pro Asp Pro Ala Ala Ala Arg Gly Val Gly
20 25 30
Arg Trp Ile Ala Gly His Leu Arg Arg His Arg Pro Ser Leu Leu Leu
35 40 45
Phe Thr Val Ala Ala Ala Ala Ala Ser Leu Cys Ser Thr Leu Val Pro
50 55 60
Val Gln Ile Gly Ala Ala Phe Ala Glu Ala Thr Gly Pro Arg His Asp
65 70 75 80
Leu Gly Ala Val Gly Val Ala Ala Leu Ala Ala Ala Leu Leu Ala Ala
85 90 95
Gly Arg Phe Val Ala Asp Leu Leu Ser Asn Gly Ser Met Glu Val Val
100 105 110
Ala Gln Arg Val Lys Arg Asp Val Arg Asp Glu Leu Tyr Arg Ser Leu
115 120 125
Leu Thr Lys Arg Met Ala Phe His Asp Gln Gln Arg Ile Gly Asp Ile
130 135 140
Leu Ala Arg Ala Ile Asn Asp Ala Arg Leu Val Asp Tyr Met Leu Ser
145 150 155 160
Pro Gly Ala Ala Thr Ala Ala Asn Gly Val Leu Ala Leu Leu Val Pro
165 170 175
Ile Leu Phe Ile Ala Ser Leu Asp Pro Arg Leu Leu Leu Ala Pro Gly
180 185 190
Val Leu Val Leu Val Phe Ala Tyr Ala Leu Arg Tyr His Leu Arg Arg
195 200 205
Leu Tyr Pro Leu Val Met Arg Thr Arg Glu Thr Phe Ala Asp Leu Asn
210 215 220
Glu Arg Phe Ser Thr Thr Leu Ser Gly Ile Ala Thr Val Lys Ala Ala
225 230 235 240
Thr Gln Glu Asp Phe Glu Arg His Ala Leu Arg Ser Ala Ala Ala Ala
245 250 255
Tyr Arg Asp Ala Phe Val Arg Arg Gly Arg Ala Gln Ala Val Tyr Leu
260 265 270
Pro Ala Leu Ser Phe Ala Leu Ala Met Val Thr Gly Ser Leu His Ser
275 280 285
Leu Tyr Leu Tyr Ser Gln Gly Glu Leu Ala Leu Ser Gln Val Val Thr
290 295 300
Tyr Leu Gly Trp Leu Leu Leu Phe Ala Gln Pro Val Thr Met Ser Glu
305 310 315 320
Gln Ala Val Pro Val Ile Gln Glu Gly Phe Ala Ala Ala Ala Arg Met
325 330 335
Arg Gln Ile Ile Glu Gly Ala Pro Gly Glu Arg Glu Asp Thr Arg Gly
340 345 350
Ser Thr Ala Ala Val Glu Gly Ala Ile Thr Phe Asp Arg Val Ser Leu
355 360 365
Arg His Glu Gly Arg Asp Ile Leu Arg Glu Val Ser Phe His Leu Pro
370 375 380
Ala Gly Arg Thr Leu Ala Val Val Gly Pro Thr Gly Ser Gly Lys Ser
385 390 395 400
Met Leu Ile Lys Leu Val Asn Arg Met Tyr Asp Ala Thr Glu Gly Arg
405 410 415
Val Leu Ile Asp Gly Arg Asp Val Arg Glu Trp Glu Pro Gly Ala Leu
420 425 430
Arg Arg Gln Ile Gly His Val Asp Gln Glu Ile Phe Leu Phe Ser Lys
435 440 445
Ser Val Leu Asp Asn Ile Ala Phe Gly Ala Pro His Ala Ala Gly Tyr
450 455 460
Glu Asp Val Leu Arg Val Ala Lys Gln Ala Cys Ala Asp Glu Phe Val
465 470 475 480
Gln Gly Met Ala Asp Gly Tyr Ala Thr Val Leu Asn Glu Gly Gly Thr
485 490 495
Thr Leu Ser Gly Gly Gln Arg Gln Arg Leu Ala Ile Ala Arg Ala Leu
500 505 510
Leu Thr Glu Pro Arg Ile Leu Thr Leu Asp Asp Ala Thr Ser Ala Val
515 520 525
Asp Ala Arg Thr Glu Ser Ala Ile Thr Glu Ala Ile Glu Arg Ala Thr
530 535 540
Ala Gly Arg Thr Thr Val Leu Val Ser His Arg Pro Gly Gln Ile Arg
545 550 555 560
Arg Ala Asp Leu Ile Leu Leu Leu Asp Gly Gly Arg Val Val Asp Gln
565 570 575
Gly Ser His Asp Glu Leu Met Ala Arg Cys Ala Leu Tyr Arg Glu Ile
580 585 590
Tyr Ser Gly
595
<210> 10
<211> 592
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 10
Val Ser Asp Leu Phe Ala Gly Leu His Ala Asp Ala Tyr Asp Arg Val
1 5 10 15
Tyr Gly Asn Arg Arg Leu Leu Ala Arg Ile Leu Arg Gln Leu Arg Ala
20 25 30
His Arg Ala Gly Met Ile Gly Ser Ala Leu Leu Val Ala Gly Ser Val
35 40 45
Leu Leu Thr Ala Ser Val Pro Ile Val Val Ser Arg Leu Ile Asp Gln
50 55 60
Ala Gly Gly Gly Ser Gly Pro Gly Leu Leu Leu Cys Leu Phe Val Val
65 70 75 80
Ala Ala Ala Val Leu Ala Trp Ala Met Asn Trp Ala Arg Gln Ala Ile
85 90 95
Thr Ala Arg Leu Val Gly Gly Met Val Tyr Arg Leu Gln Cys Glu Ala
100 105 110
Ala Asp Ala Ala Leu Ala Lys Asp Val Ala Phe Tyr Asp Glu Asn Ser
115 120 125
Val Gly Lys Val Leu Ser Arg Val Thr Gly Asp Thr Glu Gly Phe Gly
130 135 140
Ser Val Leu Thr Leu Thr Leu Asn Leu Ile Ser Gln Leu Phe Leu Val
145 150 155 160
Val Leu Leu Gly Gly Val Val Phe Trp Ile Asp Pro Gly Leu Gly Leu
165 170 175
Val Ile Leu Ala Ala Leu Pro Phe Leu Leu Gly Thr Ala Leu Ala Phe
180 185 190
Arg Arg Val Ala Arg Thr Ala Ala Ala Ala Thr Arg Arg Val Met Ala
195 200 205
Lys Val Asn Ala Asn Val His Glu Thr Met Leu Gly Ile Ala Val Ala
210 215 220
Lys Ser Phe Gly Arg Glu Arg Ala Val His Asp Asp Phe Asp Asp Val
225 230 235 240
Asn Arg Leu Ser Tyr Arg Val Tyr Val Arg Gln Gly Leu Ile Tyr Ala
245 250 255
Val Ile Leu Pro Val Leu Thr Leu Leu Ala Gly Leu Ala Thr Ala Val
260 265 270
Val Leu Tyr Gln Gly Gly Arg Phe Ala Ala Ile Gly Arg Leu Ser Ala
275 280 285
Gly Glu Trp Tyr Phe Ala Leu Gln Ala Leu Gly Met Leu Trp Gln Pro
290 295 300
Val Thr Gln Ala Ala Ser Phe Trp Ser Leu Phe Gln Gln Gly Leu Ala
305 310 315 320
Ala Ala Glu Arg Val Phe Ala Leu Ile Asp Ser Asp His Ala Val Val
325 330 335
Gln Ser Gly Asn Ala Pro Val Thr Ala Leu Thr Gly Glu Ile Glu Ala
340 345 350
Arg Gly Leu His Phe Arg Tyr Gly Ser Gly Ala Ala Val Phe Arg Asp
355 360 365
Phe Asp Val Arg Leu Ala Ala Arg Glu Thr Val Ala Val Val Gly His
370 375 380
Thr Gly Gly Gly Lys Ser Thr Leu Ala Lys Leu Ile Ala Arg Ala Tyr
385 390 395 400
Asp Phe Gln Gly Gly Glu Leu Leu Val Asp Gly Arg Asp Ile Arg Gly
405 410 415
Leu Asp Leu Arg Ala Tyr Arg Arg Arg Val Gly Val Val Pro Gln His
420 425 430
Pro Phe Ile Phe Ala Gly Thr Leu Ala Glu Asn Ile Ala Tyr Gly Arg
435 440 445
Pro Gly Ala Gly Arg Asp Asp Val Val Arg Ala Val Glu Arg Ile Gly
450 455 460
His Arg Val Trp Asp Arg Ser Met Pro Met Ser Leu Asp Asp Arg Leu
465 470 475 480
Ala Ala Gly Gly Gln Gly Val Ser Val Gly Gln Arg Gln Leu Ile Ala
485 490 495
Leu Ala Arg Met Phe Val Arg Glu Pro Asp Ile Leu Leu Leu Asp Glu
500 505 510
Pro Thr Ala Ser Val Asp Pro Leu Thr Glu Arg Gly Ile Gln Asp Ala
515 520 525
Leu Ala Arg Leu Cys Ala Gly Arg Thr Val Val Val Ile Ala His Arg
530 535 540
Leu Ser Thr Ile Arg Arg Ala Asp Arg Val Leu Val Leu His Gly Gly
545 550 555 560
Glu Ile Ala Glu Gln Gly Arg Phe Asp Glu Leu Leu Arg Arg Asp Gly
565 570 575
Pro Phe Ala Ala Leu Tyr Ala Thr Tyr Tyr Ala His Gln Glu Thr Ala
580 585 590
<210> 11
<211> 161
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 11
Met Thr Cys Ala Pro His Pro Ala Glu Phe Glu Pro Ile Gln Arg Glu
1 5 10 15
Ala Ser Ala Thr Gly Val Arg Cys Val Val Gly Ala Val Leu Phe Asn
20 25 30
Pro Leu Gly Glu Val Phe Leu Gln Arg Arg Ala Ala His Val Arg Leu
35 40 45
Phe Pro Gly Cys Trp Asp Ile Val Gly Gly His Val Glu Arg Gly Glu
50 55 60
Thr Leu Cys Ala Ala Leu Ala Arg Glu Ile Glu Glu Glu Thr Gly Trp
65 70 75 80
Arg Leu Leu Arg Val Gly Gly Leu Val Asp Val Phe Asp Trp Thr Gly
85 90 95
Gly Asp Gly Gly Leu Arg Arg Glu Ile Asp Val Leu Ala Thr Val Glu
100 105 110
Gly Asp Leu Thr Arg Pro Ala Ile Glu Gln Asp Lys Phe Asp Glu Ala
115 120 125
Arg Trp Leu Asp Gly Asp Ala Leu Arg Arg Leu Ala Ala Glu Ser Pro
130 135 140
Gly Ser Gly Met Val Glu Leu Ala Leu Arg Ala Leu Ala Met Arg Pro
145 150 155 160
Val
<210> 12
<211> 245
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 12
Val Ser Thr Cys Leu Thr Leu Leu Pro Ala Val Asp Val Arg Gly Gly
1 5 10 15
Arg Ala Val Arg Leu Leu Arg Gly Glu Ser Gly Ser Glu Thr Trp Tyr
20 25 30
Gly Asp Pro Leu Asp Ala Ala Leu Ala Trp Gln Asp Ser Gly Ala Asp
35 40 45
Trp Val His Leu Val Asp Leu Asp Ala Ala Phe Gly Thr Gly Ser Asn
50 55 60
Arg Ala Arg Ile Ala Glu Val Val Gly Ala Leu Asp Ile Pro Val Glu
65 70 75 80
Leu Cys Gly Gly Val Arg Asp Asp Ala Ser Leu Ala Ala Ala Leu Ala
85 90 95
Thr Gly Cys Gly Arg Val Val Leu Gly Thr Gly Ala Leu Glu Arg Pro
100 105 110
Ala Trp Val Ala Glu Val Ile Asp Arg His Gly Asp Arg Val Ala Val
115 120 125
Glu Leu Asp Val Trp Gly Thr Thr Val Arg Ser His Gly Trp Thr Arg
130 135 140
Asp Ala Gly Glu Leu Tyr Glu Thr Leu Ala Arg Leu Asp Ala Ala Gly
145 150 155 160
Cys Ala Arg Tyr Val Val Thr Asp Ile Ala Arg Asp Gly Thr Leu Gly
165 170 175
Gly Pro Asn Leu Lys Leu Leu Arg Asp Val Cys Ala Ala Thr Gly Arg
180 185 190
Pro Val Val Ala Ser Gly Gly Ile Ser Ser Leu Asp Asp Leu Arg Ala
195 200 205
Val Ser Ser Leu Thr Pro Leu Gly Val Glu Gly Val Val Val Gly Lys
210 215 220
Ala Leu Tyr Ala Gly Arg Phe Thr Leu Arg Gln Ala Leu Asp Ala Val
225 230 235 240
Arg Lys Pro Glu Pro
245
<210> 13
<211> 288
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 13
Met Val Ser Leu Ala Leu Pro Lys Gly Thr Phe Leu Glu Arg Pro Val
1 5 10 15
Leu Asp Leu Phe Ala Ala Ala Gly Leu Glu Val Arg Arg Pro Ser Glu
20 25 30
Arg Ser Tyr Arg Ala Ser Ile Ala Tyr Asp Gly Gly Val Glu Val Ala
35 40 45
Phe His Lys Pro Arg Glu Ile Pro Leu Ala Val Glu Arg Gly Val Phe
50 55 60
Asp Phe Gly Val Thr Gly Thr Asp Trp Ile Glu Glu Thr Gly Ala Lys
65 70 75 80
Val Glu Leu Val Glu Ala Ser Gly Cys Val Pro Pro Trp Arg Leu Val
85 90 95
Leu Ala Val Pro Ser Gly His Pro Ala Val Asp Ala Ala Gly Leu Pro
100 105 110
Ala Gly Ala Arg Val Ala Thr Gly Phe Pro Lys Ile Ser Arg Gln Tyr
115 120 125
Phe Gln Ser Val Pro Leu Pro Val Arg Ile Val Pro Ser Phe Gly Ala
130 135 140
Thr Glu Ala Lys Val Pro Glu Leu Ala Asp Ala Val Ile Glu Thr Asp
145 150 155 160
Gly Pro Gly Ser Ala Leu Asp Glu His Asp Leu Arg Val Val Ala Thr
165 170 175
Leu Arg Thr Cys Leu Pro Gln Val Val Ala Ser Pro Ala Ala Trp Arg
180 185 190
Asp Ala Arg Arg Arg Ala Ala Ile Gln Arg Val Ala Arg Leu Leu Ala
195 200 205
Ser Val Asp Gly Gly Ala Ala His Val Leu Leu Thr Val Arg Thr Thr
210 215 220
Thr Arg Asp Leu Pro Arg Val Ala Gly Ser Met Pro Glu Arg Ser Trp
225 230 235 240
Arg Ala Gly Thr Gly Leu Thr Glu Asn Leu Val Val Val Gln Gly Leu
245 250 255
Ala Ala Arg Arg Gly Leu Ala Glu Thr Ile Gly Gly Ile Leu Ala Ala
260 265 270
Gly Ala Leu Asp Val Ile Glu Ser Arg Val Gly Lys Asp Val Thr Pro
275 280 285
<210> 14
<211> 264
<212> PRT
<213> Actinomadura sp. ATCC 39365
<400> 14
Met Ile Val Cys Asp Leu Asp Gly Thr Leu Leu Asp Ser Arg Gly Gln
1 5 10 15
Val Ser Glu Arg Thr Arg Thr Ala Val Arg Arg Ala Arg Ala Ala Gly
20 25 30
His Val Phe Val Ile Ala Thr Ala Arg Pro Val Arg Asp Thr Arg Pro
35 40 45
Val Ala Ala Ala Leu Asp His Ala Ala Val Ala Val Cys Gly Asn Gly
50 55 60
Ser Ile Thr Phe Asp Phe Gly Ser Glu Glu Val Val Asp Tyr Arg Pro
65 70 75 80
Leu Asp Arg Gln Pro Leu Ala Ala Thr Leu Ala Leu Leu Arg Asp Arg
85 90 95
Phe Pro Gly Val Arg Leu Gly Ala Glu Cys Arg Leu Glu Leu Leu Leu
100 105 110
Glu Asp Ala Phe His Leu Pro Glu Pro Leu Ala Arg Asp Ala Arg Arg
115 120 125
Val Pro Arg Leu Glu Gly Glu Ile Asp Arg His Asp Val Gly Lys Leu
130 135 140
Met Val Gln Leu Glu Gly Ala Ala Arg Gln Tyr Tyr Glu Thr Val Arg
145 150 155 160
Gly Leu Leu Thr Gly Cys Glu Val Thr Ile Ser Ala Asp Val Phe Cys
165 170 175
Glu Val Met Arg Ser Gly Val Thr Lys Ala Ala Ala Leu Glu Ser Met
180 185 190
Ala Ser Arg Leu Gly Leu Gly Ser Ala Asp Val Ile Ala Phe Gly Asp
195 200 205
Met Pro Asn Asp Leu Pro Met Leu Thr Trp Ala Gly Thr Ala Val Ala
210 215 220
Val Ala Asn Ala His Pro Ala Val Leu Gly Ala Val Gly Glu Val Thr
225 230 235 240
Ala Ser Asn Asp Asp Asp Gly Val Ala Ala Trp Leu Glu Arg His Ala
245 250 255
Met Ala Asp Phe Ser Glu Lys Tyr
260