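Patent title: Process for preparing pravastatin
Patent description: Process for preparing pravastatin. Field of the invention: The present invention relates to a method for producing pravastatin.
Background of the invention: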
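Statins are known inhibitors of 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase, the rate-limiting enzyme in cholesterol biosynthesis. Statins therefore lower plasma cholesterol levels in various mammalian species, including man, and are effective in the treatment of hypercholesterolemia. Several statins are on the market, including atorvastatin, pravastatin, compactin, lovastatin and simvastatin. While atorvastatin is made by chemical synthesis, the latter four are produced either by direct fermentation or by precursor fermentation. These (precursor) fermentations are carried out by fungi of the genera Penicillium, Aspergillus and Monascus.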
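Pravastatin is produced in two sequential fermentations. First, Penicillium citrinum produces compactin, whose lactone ring is chemically hydrolyzed; the resulting product is then fed to a culture of Streptomyces carbophilus, which hydroxylates it to pravastatin. In the context of the present invention, the term "hydrolyzed compactin" refers to the non-lactone form of compactin, i.e. the form in which the lactone ring has been opened by reaction with water (Figure 1); likewise, the term "hydrolysis of compactin" refers to the opening of the lactone ring. The industrial species and processes used to produce these metabolites have been optimized by various methods. In this way, the compactin yield of Penicillium citrinum was raised from the original 40 mg/L to 5 g/L. For the biocatalytic conversion, Metkinen obtained a Streptomyces mutant strain that is resistant to 3 g/L mevastatin (compactin) and gives a conversion rate of 80% (Metkinen News March 2000, Metkinen Oy, Finland; reviewed by Manzoni and Rollini, 2002, Appl Microbiol Biotechnol 58:555-564). Although commercially viable, the process is far from optimal: compactin titers are relatively low compared with, for example, industrial amino acid or penicillin G production; moreover, compactin must be diluted to prevent toxic effects on the Streptomyces strain used in the bioconversion (Hosobuchi et al., 1983, J Antibiotics 36:887-891), and 20% of the compactin fed cannot be converted by the Streptomyces strain.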
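The conversion of compactin to pravastatin is catalyzed by a p450 enzyme of Streptomyces carbophilus (see Matsuoka et al., 1989, Eur. J. Biochem. 184:707-713). A common problem with Streptomyces bacteria is that they grow as mycelium, which leads to high-viscosity cultures, low oxygen transfer rates and hence lower fermentation output. Ideally, a host widely used in large-scale biocatalysis, an industrially well-equipped species such as Escherichia coli, would be useful, but this species has neither p450 enzymes nor a p450 redox regeneration system. To date, the use of a species suited to fermentation and enzyme production, such as Escherichia coli, for the conversion of compactin to pravastatin has not been reported.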
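Another problem is the p450 enzymes' need for cofactor regeneration, which is normally achieved by a specific pair of proteins present in the host cell. If this system is not optimal, the overall conversion will in practice be below 100%, as in the compactin example. Various attempts have been made to isolate alternative species, but none of them gives 100% conversion (see US 6,905,851, US 6,365,382, US 2005/0153422, US 2004/0253692 and US 2004/0209335). Moreover, none of them shows a real improvement over Streptomyces carbophilus. Species with very high resistance to compactin have also been reported, but they provide only very inefficient conversion (US 6,306,629, US 6,750,366). Others have proposed family shuffling as a way of improving the conversion rate of known p450 enzymes, but gave no data (US 6,605,430); in practice this would be very difficult, because p450 enzymes can be highly substrate specific, share little sequence identity and require specific enzymes for cofactor regeneration. Attempts have been made to solve the latter problem by isolating species that use different enzymes for the conversion. One specific example is an Actinomadura species able to convert compactin to pravastatin with a maximum conversion rate of 78% (Peng and Demain, 1998, J. Ind. Microbiol. Biotechnol. 20:373-375; US 6,274,360). Thus, despite all efforts, Streptomyces carbophilus, with a conversion rate of only 80%, is still the industrial species of choice for pravastatin conversion, and improvement is highly desirable.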
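Description of the invention
In the context of the present invention, the term "conservative substitution" denotes a substitution in which an amino acid residue is replaced by an amino acid residue having a similar side chain. These families are known in the art and include amino acids with basic side chains (e.g. lysine, arginine and histidine), acidic side chains (e.g. aspartic acid, glutamic acid), uncharged polar side chains (e.g. glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g. alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), β-branched side chains (e.g. threonine, valine, isoleucine) and aromatic side chains (e.g. tyrosine, phenylalanine, tryptophan, histidine).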
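The side-chain families listed above can be written down as a small lookup, which makes the definition of a conservative substitution easy to test. The following is a minimal sketch only: the names `SIDE_CHAIN_FAMILIES`, `shared_families` and `is_conservative` are illustrative and not part of the specification, and the rule "two different residues that share at least one family" is an assumed reading of the definition.

```python
# Side-chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a: str, b: str) -> list[str]:
    """Return the names of all families that contain both residues."""
    a, b = a.upper(), b.upper()
    return [name for name, members in SIDE_CHAIN_FAMILIES.items()
            if a in members and b in members]


def is_conservative(a: str, b: str) -> bool:
    """Assumed rule: a != b and both residues share at least one family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))


if __name__ == "__main__":
    for pair in [("K", "R"), ("D", "K"), ("T", "V"), ("Y", "F")]:
        print(pair, is_conservative(*pair), shared_families(*pair))
```

Because some residues belong to more than one family (tyrosine is both uncharged polar and aromatic), the lookup returns every shared family rather than a single label.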
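The term "isolated polynucleotide or nucleic acid sequence" as used herein refers to a polynucleotide or nucleic acid sequence that is essentially free of other nucleic acid sequences, e.g. at least 20% pure, preferably at least 40% pure, more preferably at least 60% pure, even more preferably at least 80% pure, most preferably at least 90% pure, as determined by agarose electrophoresis. For example, an isolated nucleic acid sequence can be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced.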
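The term "pravastatin" is defined as the 6'-hydroxy variant of compactin in the α- or β-configuration, or a mixture of both. It is important to note that in the scientific literature the term pravastatin is used only for the β-configuration of the 6'-hydroxy variant of compactin, whereas the α-variant is called epi-pravastatin. The present invention, however, describes a generally applicable method for producing the 6'-hydroxy variant of compactin. The term pravastatin is therefore applied here to both the α- and β-forms.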
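One object of the present invention is to provide an efficient and industrially applicable method for converting compactin to pravastatin. A further object is the use of a novel p450 enzyme from Amycolatopsis orientalis to convert compactin to pravastatin. The invention solves the problems encountered in prior-art processes by providing a process in which compactin is hydroxylated efficiently in Escherichia coli. The invention also provides a process in which compactin is hydroxylated with 100% conversion. More particularly, the invention provides a process in which compactin is contacted with the Amycolatopsis orientalis compactin hydroxylase (encoded by the cmpH gene) by contacting whole cells or a cell-free extract of Amycolatopsis orientalis with compactin. Preferably, a process is provided in which the compactin hydroxylase (cmpH) is obtained from Amycolatopsis orientalis and transferred to another host species. Preferably, this host tolerates high levels of compactin and is able to produce compactin.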
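In a first aspect, the invention provides a polypeptide selected from the group consisting of a polypeptide having the amino acid sequence of SEQ ID NO 3 and polypeptides having an amino acid sequence substantially homologous to SEQ ID NO 3, said polypeptides exhibiting compactin hydroxylase activity.
In a first embodiment, the polypeptide hydroxylates compactin with an efficiency of at least 50%, preferably at least 70%, more preferably at least 80%, even more preferably at least 90%, most preferably at least 99%. Preferably, the product of the hydroxylation is pravastatin.
As part of the present invention, it is shown that the industrial application of the compactin hydroxylases available to date is restricted to species from the Actinomycetes; that is, they cannot be transferred to species better suited to industrial-scale fermentation, such as Escherichia coli, or to filamentous fungi such as Aspergillus or Penicillium. The compactin hydroxylase genes of the present invention do not have this problem. The activity of the novel polypeptides encoded by these genes can therefore be characterized as follows: they can be applied in species other than Actinomycetes, for example Escherichia coli, and/or they can hydroxylate compactin with a conversion efficiency of at least 80%. In the context of the present invention, an efficiency of at least 80% means that at least 80% of the compactin is converted to pravastatin.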
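The efficiency criterion can be made concrete with a short calculation. The sketch below is illustrative only: it assumes efficiency is the molar fraction of fed (hydrolyzed) compactin recovered as pravastatin, uses the molecular weights given later in the LC-MS section of this document (408.25 and 424.25), and the function names are hypothetical.

```python
MW_HYDROLYZED_COMPACTIN = 408.25  # g/mol, value quoted in the LC-MS section
MW_PRAVASTATIN = 424.25           # g/mol, value quoted in the LC-MS section


def conversion_efficiency(compactin_fed_mg_l: float,
                          pravastatin_found_mg_l: float) -> float:
    """Molar percentage of the compactin fed that is recovered as pravastatin."""
    if compactin_fed_mg_l <= 0:
        raise ValueError("compactin fed must be positive")
    fed_mmol = compactin_fed_mg_l / MW_HYDROLYZED_COMPACTIN
    formed_mmol = pravastatin_found_mg_l / MW_PRAVASTATIN
    return 100.0 * formed_mmol / fed_mmol


def meets_threshold(efficiency_percent: float, threshold: float = 80.0) -> bool:
    """True if the efficiency reaches the stated minimum (80% by default)."""
    return efficiency_percent >= threshold


if __name__ == "__main__":
    eff = conversion_efficiency(200.0, 170.0)
    print(f"{eff:.1f}% conversion, meets 80% threshold: {meets_threshold(eff)}")
```

Working in molar rather than mass terms matters here because pravastatin is heavier than hydrolyzed compactin, so equal mass concentrations do not correspond to complete conversion.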
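A polypeptide having an amino acid sequence substantially homologous to SEQ ID NO 3 is defined as a polypeptide whose amino acid sequence has a degree of identity to the specified amino acid sequence of at least 50%, preferably at least 60%, more preferably at least 75%, even more preferably at least 90%, most preferably at least 95%, even most preferably at least 97%, ultimately at least 98% and most ultimately at least 99%, the substantially homologous peptide exhibiting compactin hydroxylase activity. Substantially homologous polypeptides include polymorphisms that may exist in cells from different populations, or within a population, owing to natural allelic or intra-strain variation. Substantially homologous polypeptides may also be derived from a species other than the species from which the specified amino acid and/or DNA sequence originates, or may be encoded by an artificially designed and synthesized DNA sequence. DNA sequences related to the specified DNA sequences and obtained through degeneracy of the genetic code are also part of the invention. Homologues also include biologically active fragments of the full-length sequence that still exhibit compactin hydroxylase activity.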
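The degree of identity between two amino acid sequences is the percentage of amino acids that are identical between the two sequences. The degree of identity is determined using the BLAST algorithm, which is described in Altschul et al. (1990, J. Mol. Biol. 215:403-410). Software for BLAST analysis is available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1989, Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 50, expectation (E) of 10, M=5 and N=-4.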
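To illustrate what "percentage of identical amino acids" means once two sequences have been aligned, the sketch below counts identical positions in a pre-aligned pair. It is not BLAST and does not reproduce its scoring or its choice of denominator: skipping gapped columns is an assumption, the function name is hypothetical, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the ungapped aligned columns.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Columns containing a gap in either sequence are skipped; this is one
    possible convention, chosen here for simplicity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented toy sequences, not taken from the sequence listing.
    print(percent_identity("MKTAY", "MKSAY"))
```

A threshold such as "at least 50% identity" is then a simple comparison on the returned value; in practice the alignment itself would come from BLAST with the parameters given above.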
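Substantially homologous polypeptides may contain only conservative substitutions of one or more amino acids of the specified amino acid sequence, or substitutions, insertions or deletions of non-essential amino acids. A non-essential amino acid is thus a residue that can be altered in one of these sequences without substantially altering the biological function. For example, guidance on how to make phenotypically silent amino acid substitutions is provided in Bowie et al. (1990, Science 247:1306-1310), where the authors indicate that there are two main approaches for studying the tolerance of an amino acid sequence to change. The first relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene, followed by selection or screening to identify sequences that maintain functionality. These studies revealed that proteins are surprisingly tolerant of amino acid substitutions and indicate which changes are likely to be permissive at a given position in the protein. For example, most buried amino acid residues require nonpolar side chains, whereas few features of surface side chains are generally conserved. Other such phenotypically silent substitutions are described in Bowie et al. and the references cited therein.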
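In a second embodiment, variants with improved catalytic function (i.e. conversion of compactin to pravastatin) can be obtained by modifying the polynucleotide sequence encoding the compactin hydroxylase. Such modifications include:
- improving codon usage so that the codons are adapted to the host species used to express the compactin hydroxylase;
- improving codon-pair usage so that the codon pairs are adapted to the host species used to express the compactin hydroxylase;
- adding stabilizing sequences to the genomic information encoding the compactin hydroxylase, giving mRNA molecules with an increased half-life;
- introducing random mutations by error-prone PCR, then screening the resulting variants (essentially as described in Example 4) and isolating variants with improved kinetic properties;
- family shuffling of compactin-hydroxylase-related variants, then screening the resulting variants (essentially as described in Example 4) and isolating variants with improved kinetic properties.
Preferred methods for isolating variants with improved kinetic properties are described in WO03010183 and WO0301311.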
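Improved catalytic function is obtained when an improved polynucleotide encoding a compactin hydroxylase with improved functionality is obtained. As part of the invention, it was surprisingly found that the ratio between the β-configuration of the 6-hydroxy variant of compactin (i.e. the pharmaceutically active pravastatin isomer) and the α-configuration of the 6-hydroxy variant of compactin can be improved significantly by using the improved polypeptide sequences of SEQ ID NO 19, 20, 21, 22, 23, 24, 25 or 26, or sequences substantially homologous thereto.
In addition, certain stretches in the polypeptide sequence of the first aspect of the invention were found to be directly involved in the catalytic mechanism of compactin hydroxylation. These are SEQ ID NO 43, 44, 45, 46 and 47. Improved catalytic function can be obtained by introducing modifications in any or all of SEQ ID NO 43-47. Preferably, any or all of SEQ ID NO 43-47 are modified by replacing a single amino acid, two amino acids, three amino acids or at most four amino acids. The following modifications were found to lead to improved catalytic function: for SEQ ID NO 43, the preferred modifications are SEQ ID NO 48, 49 and 50; for SEQ ID NO 44, SEQ ID NO 51, 52 and 53; for SEQ ID NO 45, SEQ ID NO 54; for SEQ ID NO 46, SEQ ID NO 55; and for SEQ ID NO 47, SEQ ID NO 56, 57, 58 and 59. Stretches suitable for contributing to compactin hydroxylation are also SEQ ID NO 43-59 in which one, two or three amino acids are replaced by alternative amino acids.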
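In a third embodiment, a polynucleotide or nucleic acid sequence is provided comprising a DNA sequence encoding the polypeptides described above. This may be an isolated polynucleotide of genomic, cDNA, RNA, semi-synthetic or synthetic origin, or any combination thereof. Specifically, the particular DNA sequences encoding the polypeptide of SEQ ID NO 3 are provided, i.e. SEQ ID NO 1 or 2. More preferably, the particular DNA sequences encoding the polypeptides of SEQ ID NO 19-26 are provided, i.e. SEQ ID NO 11-18. Unless otherwise indicated, all nucleotide sequences determined herein by sequencing a DNA molecule were determined using an automated DNA sequencer, and all amino acid sequences of polypeptides encoded by DNA molecules determined herein were predicted by translation of the DNA sequence so determined. Any nucleotide sequence determined by this automated approach may therefore contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical, to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be determined more precisely by other approaches, including manual DNA sequencing methods. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence, compared with the actual sequence, causes a frameshift in translation, so that the predicted amino acid sequence encoded by the determined nucleotide sequence differs completely from the amino acid sequence actually encoded by the sequenced molecule, beginning at the point of the insertion or deletion. The person skilled in the art is able to identify such misidentified bases and knows how to correct such errors.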
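The frameshift effect described above is easy to demonstrate. The sketch below translates a short invented DNA sequence with the standard genetic code, then repeats the translation after inserting a single base; the sequence and the helper names are illustrative and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate in frame 1; a trailing incomplete codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_base(dna: str, position: int, base: str) -> str:
    """Return the sequence with one base inserted before `position` (0-based)."""
    return dna[:position] + base + dna[position:]


if __name__ == "__main__":
    true_seq = "ATGGCCAAGCTGACC"               # invented example
    misread = insert_base(true_seq, 3, "G")    # one spurious base after codon 1
    print(translate(true_seq))   # MAKLT
    print(translate(misread))    # MGQAD
```

Only the first residue survives the insertion; every codon downstream is read in a shifted frame, which is why a single miscalled base can make a predicted protein sequence diverge completely from that point on.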
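The polypeptides and encoding nucleic acid sequences of the first aspect of the invention can be obtained from any prokaryotic cell, preferably from Actinomycetes. Preferred Actinomycete species include, but are not limited to, strains of Streptomyces, Amycolatopsis, Pseudonocardia, Micromonospora, Nocardia and Actinokineospora. In a preferred embodiment, the nucleic acid sequence encoding the polypeptide of the invention is obtained from a strain of Amycolatopsis orientalis.
DNA sequences of the invention can be identified by hybridization. Nucleic acid molecules corresponding to variants (e.g. natural allelic variants) and homologues of the DNA of the invention can be isolated on the basis of their homology to the nucleic acids disclosed herein, using these nucleic acids or a suitable fragment thereof as a hybridization probe according to standard hybridization techniques, preferably under highly stringent hybridization conditions. Alternatively, in silico screening of available genome databases can be applied. The "stringency" of hybridization reactions is readily determined by one of ordinary skill in the art. Additional details and explanation of the stringency of hybridization reactions can be found in Ausubel et al. (1995, Current Protocols in Molecular Biology, Wiley Interscience Publishers).
Nucleic acid sequences can be isolated by, for example, screening a genomic or cDNA library of the microorganism under study. Once a nucleic acid sequence encoding a polypeptide with the activity of the invention has been detected, for example with a probe derived from SEQ ID NO 2, the sequence can be isolated or cloned using techniques known to those of ordinary skill in the art (see Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2nd edition, Cold Spring Harbor, New York). The nucleic acid sequences of the invention can also be cloned from such (genomic) DNA by, for example, methods based on the polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features (see e.g. Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York).
The sequence information provided herein should not be construed so narrowly as to require the inclusion of misidentified bases. The specific sequences disclosed herein can readily be used to isolate the complete gene from Actinomycetes, in particular Amycolatopsis orientalis, which in turn can readily be subjected to further sequence analysis to identify sequencing errors.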
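In a fourth embodiment, the invention provides an improved compactin hydroxylase by fusing the polypeptide of SEQ ID NO 3 to a so-called reductase domain, forming the polypeptide of SEQ ID NO 6, which exhibits compactin hydroxylase activity. The scope of the invention is not limited to this specific amino acid sequence but includes polypeptides having an amino acid sequence "substantially homologous" to SEQ ID NO 6, defined as polypeptides whose amino acid sequence has a degree of identity to the specified amino acid sequence of at least 60%, preferably at least 70%, more preferably at least 80%, even more preferably at least 85%, even more preferably at least 90%, even more preferably at least 95%, even more preferably at least 98%, most preferably at least 99%, the substantially homologous peptide displaying compactin hydroxylase activity. Substantially homologous polypeptides may include polymorphisms that may exist in cells from different populations, or within a population, owing to natural allelic or intra-strain variation. Substantially homologous polypeptides may also be derived from a species other than the species from which the specified amino acid and/or DNA sequence originates, or may be encoded by an artificially designed and synthesized DNA sequence. DNA sequences related to the specified DNA sequences and obtained through degeneracy of the genetic code are also part of the invention. Homologues also include biologically active fragments of the full-length sequence that still exhibit compactin hydroxylase activity. It will be clear to the person skilled in the art that the hydroxylase part of the fusion protein can be exchanged for non-homologous but functionally equivalent sequences, such as other p450 enzymes able to hydroxylate compactin, e.g. the product of the Streptomyces carbophilus p450sca-2 gene, as long as the fusion protein displays hydroxylation of compactin towards pravastatin. The reductase domain can likewise be exchanged for non-homologous but functionally equivalent sequences, e.g. ferredoxin and ferredoxin reductase, as long as the fusion protein displays hydroxylation of compactin towards pravastatin. An alternative reductase domain that can be used is, for example, that of the self-sufficient P450 enzyme from Bacillus megaterium, P450 BM3, NCBI Genbank accession number gi142797. Preferred fusion polypeptides are the congeners of the improved polypeptides of SEQ ID NO 19-26, i.e. SEQ ID NO 35, 36, 37, 38, 39, 40, 41 or 42, or sequences substantially homologous thereto. In addition, the specific DNA sequences encoding the polypeptides of SEQ ID NO 35-42 (i.e. SEQ ID NO 27-34) are also part of the invention. Alternatively, the improvement of catalytic function described in the second embodiment is also carried out on the reductase region.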
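In a second aspect, the invention discloses the use of the polynucleotide of the first aspect in a recombinant host strain. More specifically, a method for producing pravastatin is disclosed, comprising the steps of:
(i) transforming a host cell of interest with a polynucleotide comprising a gene of interest encoding a compactin hydroxylase,
(ii) selecting clones of transformed cells,
(iii) culturing the selected cells,
(iv) optionally processing the cultured cells (i.e. immobilization),
(v) feeding compactin to the cultured cells,
(vi) isolating pravastatin from the culture.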
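In the method of the invention, the choice of host cell will depend to a large extent on the source of the nucleic acid sequence (gene) of interest encoding the polypeptide. Preferably, the host cell is a prokaryotic cell. In a preferred embodiment, the prokaryotic host cell is a cell of one of the species cited as species from which the polynucleotides of the first or second aspect can be obtained, examples of which are, but are not limited to, Streptomyces species (i.e. Streptomyces carbophilus, Streptomyces flavidovirens, Streptomyces coelicolor, Streptomyces lividans, Streptomyces exfoliatus) or Amycolatopsis species (i.e. Amycolatopsis orientalis). Most preferably, the host cell is one suitable for large-scale fermentation, examples of which are, but are not limited to, Streptomyces species (i.e. Streptomyces avermitilis, Streptomyces lividans, Streptomyces clavuligerus), Bacillus species (i.e. Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis), Corynebacterium species (i.e. Corynebacterium glutamicum) or Escherichia species (i.e. Escherichia coli). Even more preferably, the host cell is a eukaryotic cell, such as a Saccharomyces, Aspergillus or Penicillium species, suitable examples being the yeast Saccharomyces cerevisiae or the filamentous fungi Aspergillus niger, Penicillium chrysogenum or Penicillium citrinum.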
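A nucleic acid construct, e.g. an expression construct, may contain a selectable marker gene and the polynucleotide of the invention (compactin hydroxylase), each operably linked to one or more control sequences that direct expression of the encoded polypeptide in a suitable expression host. The nucleic acid constructs may be on separate fragments or, preferably, on one DNA fragment. Expression is understood to include any step involved in the production of the polypeptide and may include transcription, post-transcriptional modification, translation, post-translational modification and secretion. The term "nucleic acid construct" is synonymous with the term "expression vector" or "cassette" when the nucleic acid construct contains all the control sequences required for expression of a coding sequence in a particular host organism. The term "control sequences" is defined herein to include all components that are necessary or advantageous for expression of a polypeptide. Each control sequence may be native or foreign to the nucleic acid encoding the polypeptide. Such control sequences may include, but are not limited to, a promoter, a leader, optimal translation initiation sequences (as described in Kozak, 1991, J. Biol. Chem. 266:19867-19870), a secretion signal sequence, a propeptide sequence, a polyadenylation sequence and a transcription terminator. At a minimum, the control sequences include a promoter and transcriptional and translational stop signals. The term "operably linked" is defined herein as a configuration in which a control sequence is placed at a position relative to the coding sequence of the DNA sequence such that the control sequence directs production of the polypeptide.
The control sequence may include an appropriate promoter sequence containing transcriptional control sequences. The promoter may be any nucleic acid sequence that shows transcription-regulatory activity in the cell, including mutant, truncated and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides. The promoter may be homologous or heterologous to the cell or to the polypeptide. Preferred promoters for prokaryotic cells are known in the art and can be, for example, strong promoters that ensure high levels of messenger RNA. The promoter used in an expression cassette according to the invention can be chosen from the set of well-known inducible promoters for high expression of operons/genes such as the lactose operon (lac, lacUV5), the arabinose operon (ara), the tryptophan operon (trp) and the operon encoding enzymes common to the biosynthesis of all aromatic amino acids (aro), or functional hybrids of these promoters, for example the tac promoter, a fusion of the trp and lac promoters (Amann et al., 1983, Gene 25:161-178). Alternatively, constitutive promoters can be used, which provide a constant supply of messenger RNA throughout the life of the cell. Any other useful promoter can be found at, among others, the NCBI website (http://www.ncbi.nlm.nih.gov/entrez/).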
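In a preferred embodiment, the promoter may be derived from a highly expressed gene (defined herein as an mRNA concentration of at least 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a moderately expressed gene (defined herein as an mRNA concentration of 0.01% to 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a lowly expressed gene (defined herein as an mRNA concentration below 0.01% (w/w) of total cellular mRNA).
In an even more preferred embodiment, microarray data are used to select genes, and thereby the promoters of those genes, that have a defined level and regulation of transcription. In this way, the gene expression cassette can be optimally adapted to the conditions under which it should function.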
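The three expression classes defined above amount to a simple threshold rule on the mRNA fraction. A minimal sketch follows; the function name and the gene names in the usage example are hypothetical, and treating exactly 0.5% as "high" and exactly 0.01% as "medium" is an assumed reading of the boundary wording.

```python
def expression_class(mrna_percent_of_total: float) -> str:
    """Classify a gene by its mRNA share (% w/w of total cellular mRNA)."""
    if not 0.0 <= mrna_percent_of_total <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if mrna_percent_of_total >= 0.5:
        return "high"
    if mrna_percent_of_total >= 0.01:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical microarray-derived shares for three made-up genes.
    shares = {"geneA": 1.2, "geneB": 0.05, "geneC": 0.002}
    for gene, share in shares.items():
        print(gene, expression_class(share))
```

A rule of this kind could be applied to microarray-derived transcript shares to shortlist donor genes whose promoters fall in the desired class.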
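Alternatively, random DNA fragments can be cloned in front of the polynucleotide of the invention. These can be isolated by a so-called direct selection approach. Using a promoterless selectable marker gene (i.e. kanamycin resistance), random DNA fragments can be cloned in front of that gene and readily screened for active promoters, since these enable growth on kanamycin-containing medium. The DNA fragments can be derived from many sources, i.e. different species, PCR amplification, synthesis and so on. The sequences can then be isolated and cloned in front of the polynucleotide of the invention. A similar strategy can be used to enhance translation of the messenger RNA pool by introducing, by recDNA methods, a 5'-untranslated leader from an efficiently translated messenger RNA, such as can be obtained from the tuf gene encoding the highly expressed elongation factor Tu protein, or from modified or synthetic variants of the tryptophan operon.
The control sequence may also include a suitable transcription terminator sequence, a sequence recognized by a prokaryotic cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator that is functional in the cell may be used in the present invention. Preferred terminators for prokaryotic cells are obtained from the native gene to be expressed, or from sources such as rRNA genes or viral operons, for example a ribosomal RNA terminator or the fd terminator (Sambrook et al., 1989, Molecular Cloning, 2nd edition, CSH Press).
For secretion of the polypeptide, the control sequence may include a signal-peptide coding region, which encodes an amino acid sequence linked to the amino terminus of the polypeptide and directs the encoded polypeptide into the cell's secretory pathway. The 5' end of the coding sequence may inherently contain a signal-peptide coding region naturally linked in translation reading frame with the segment of the coding region that encodes the secreted protein. Alternatively, the 5' end of the coding sequence may contain a signal-peptide coding region that is foreign to the coding sequence. A foreign signal-peptide coding region may be required where the coding sequence does not normally contain one. Alternatively, the foreign signal-peptide coding region may simply replace the natural signal-peptide coding region in order to obtain enhanced secretion of the polypeptide.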
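The nucleic acid construct may be an expression vector. The expression vector may be any vector (e.g. a plasmid or virus) that can conveniently be subjected to recombinant DNA procedures and can bring about expression of the nucleic acid sequence encoding the polypeptide. The choice of vector will typically depend on the compatibility of the vector with the cell into which it is to be introduced. The vector may be a linear or a closed circular plasmid.
In another embodiment, the expression cassette mentioned above is additionally modified by replacing the original promoter with the trp promoter or the aro promoter. To take full advantage of the resulting increase in expression efficiency, further modifications concerning enhanced gene expression, messenger RNA translation and plasmid stability can be applied to the recDNA constructs used to create the actual production strain, such as addition of the transcription terminator of phage fd, or introduction of the partitioning function par from plasmid pSC101 (Churchward et al., 1983, Nucl. Acid. Res. 11:5645-5659).
To increase production of the desired protein, the expression cassette can be inserted on an extrachromosomal element such as plasmid ColE1, ColD, R1162, RK2 or derivatives thereof, which are present at a predetermined low copy number or, usually, a dynamic high copy number, and which can propagate or replicate autonomously in, for example, Escherichia coli strains HB101, B7, RV308, DH1, HMS174, W3110 and BL21.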
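The vector may be an autonomously replicating vector, i.e. a vector that exists as an extrachromosomal entity whose replication is independent of chromosomal replication, e.g. a plasmid, an extrachromosomal element, a minichromosome or an artificial chromosome. Alternatively, the vector may be one that, when introduced into the cell, is integrated into the genome and replicated together with the chromosome into which it has been integrated. An integrative cloning vector may integrate at random or at a predetermined target locus in the chromosomes of the host cell. In a preferred embodiment of the invention, the integrative cloning vector comprises a DNA fragment that is homologous to a DNA sequence in a predetermined target locus in the genome of the host cell, for targeting integration of the cloning vector to this predetermined locus. To promote targeted integration, the cloning vector is preferably linearized before transformation of the host cell. Linearization is preferably performed such that at least one end, but preferably either end, of the cloning vector is flanked by sequences homologous to the target locus. The length of the homologous sequences flanking the target locus is preferably at least 0.1 kb, more preferably at least 0.2 kb, even more preferably at least 0.5 kb, still more preferably at least 1 kb, most preferably at least 2 kb. The vector system may be a single vector or plasmid, or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell.
The DNA construct may be used on an episomal vector. Preferably, however, the construct is integrated into the genome of the host strain.
In another embodiment, the use of the polypeptides of the invention can be improved by deleting from the genome of the host strain one or more endogenous genes for enzymes that limit the pravastatin yield. Examples of such enzymes are, but are not limited to, enzymes that hydrolyze the side chain of compactin or pravastatin.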
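In a preferred embodiment, the cmpH gene (SEQ ID NO 1), all homologous sequences, the fusion of cmpH with a reductase domain (SEQ ID NO 4) and all functional equivalents encoding compactin hydroxylase can be expressed in a compactin-producing host cell in order to produce pravastatin. In the case of a prokaryotic host, all aspects of functional expression described above can be applied in such a host. In the case of a eukaryotic host cell, the expression construct is preferably adapted for efficient expression in such a host. Preferably, the host cell is a fungus, more preferably a filamentous fungus; most preferably, the fungal host cell is a cell that produces a statin, preferably compactin. Examples are, but are not limited to, Aspergillus species (i.e. Aspergillus terreus), Penicillium species (i.e. Penicillium citrinum or chrysogenum) or Monascus species (i.e. Monascus ruber or paxii).
Preferred promoters for filamentous fungal cells are known in the art and can be, for example, the glucose-6-phosphate dehydrogenase gpdA promoter; protease promoters such as pepA, pepB, pepC; the glucoamylase glaA promoter; the amylase amyA, amyB promoters; the catalase catR or catA promoters; the glucose oxidase goxC promoter; the β-galactosidase lacA promoter; the α-glucosidase aglA promoter; the translation elongation factor tefA promoter; xylanase promoters such as xlnA, xlnB, xlnC, xlnD; cellulase promoters such as eglA, eglB, cbhA; promoters of transcriptional regulators such as areA, creA, xlnR, pacC, prtT; or any other, and can be found at, among others, the NCBI website (http://www.ncbi.nlm.nih.gov/entrez/).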
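In a preferred embodiment, the promoter may be derived from a highly expressed gene (defined herein as an mRNA concentration of at least 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a moderately expressed gene (defined herein as an mRNA concentration of 0.01% to 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a lowly expressed gene (defined herein as an mRNA concentration below 0.01% (w/w) of total cellular mRNA).
In an even more preferred embodiment, microarray data are used to select genes, and thereby the promoters of those genes, that have a defined level and regulation of transcription. In this way, the gene expression cassette can be optimally adapted to the conditions under which it should function.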
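The control sequence may also include a suitable transcription terminator sequence, a sequence recognized by a filamentous fungal cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator that is functional in the cell may be used in the present invention. Preferred terminators for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger α-glucosidase, the trpC gene and Fusarium oxysporum trypsin-like protease.
The control sequence may also include a suitable leader sequence, a non-translated region of an mRNA that is important for translation by the filamentous fungal cell. The leader sequence is operably linked to the 5' terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence that is functional in the cell may be used in the present invention. Preferred leaders for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus nidulans triose phosphate isomerase and Aspergillus niger glaA.
The control sequence may also include a polyadenylation sequence, which is operably linked to the 3' terminus of the nucleic acid sequence and which, after transcription, is recognized by the filamentous fungal cell as a signal to add polyadenosine residues to the transcribed mRNA. Any polyadenylation sequence that is functional in the cell may be used in the present invention. Preferred polyadenylation sequences for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease and Aspergillus niger α-glucosidase.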
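The vector may be an expression vector. The expression vector may be any vector (e.g. a plasmid or virus) that can conveniently be subjected to recombinant DNA procedures and can bring about expression of the nucleic acid sequence encoding the polypeptide. The choice of vector will typically depend on the compatibility of the vector with the cell into which it is to be introduced. The vector may be a linear or a closed circular plasmid. The vector may be an autonomously replicating vector, i.e. a vector that exists as an extrachromosomal entity whose replication is independent of chromosomal replication, e.g. a plasmid, an extrachromosomal element, a minichromosome or an artificial chromosome. An autonomously maintained cloning vector for filamentous fungi may comprise the AMA1 sequence (see e.g. Aleksenko and Clutterbuck, 1997, Fungal Genet. Biol. 21:373-397). Alternatively, the vector may be one that, when introduced into the cell, is integrated into the genome and replicated together with the chromosome into which it has been integrated. An integrative cloning vector may integrate at random or at a predetermined target locus in the chromosomes of the host cell. Preferably, the integrative cloning vector comprises a DNA fragment that is homologous to a DNA sequence in a predetermined target locus in the genome of the host cell, for targeting integration of the cloning vector to this predetermined locus. To promote targeted integration, the cloning vector is preferably linearized before transformation of the host cell. Linearization is preferably performed such that at least one end, but preferably either end, of the cloning vector is flanked by sequences homologous to the target locus. The length of the homologous sequences flanking the target locus is preferably at least 0.1 kb, more preferably at least 0.2 kb, even more preferably at least 0.5 kb, still more preferably at least 1 kb, most preferably at least 2 kb. The vector system may be a single vector or plasmid, or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell.
The DNA construct may be used on an episomal vector. Preferably, however, the construct is integrated into the genome of the host strain.
Fungal cells are transformed by co-transformation, i.e. a selectable marker gene is transformed together with the gene of interest. The marker may be physically linked to the gene of interest (i.e. on a plasmid) or located on a separate fragment. After transformation, transformants are screened for the presence of the selectable marker gene and subsequently analyzed for the presence of the gene of interest. A selectable marker is a product that provides resistance to a biocide or virus, resistance to heavy metals, prototrophy to auxotrophs, and the like. Useful selectable markers include amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase), sC or sutB (sulfate adenyltransferase), trpC (anthranilate synthase), ble (phleomycin resistance protein) or equivalents thereof.
The host cell obtained can be used to produce pravastatin.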
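A third aspect of the invention provides a method for isolating a polynucleotide encoding a polypeptide capable of promoting the conversion of compactin to pravastatin of the second aspect, the method comprising the steps of:
(i) transforming a host cell with the polynucleotide of the first aspect of the invention;
(ii) selecting clones of transformed cells for their ability to hydroxylate compactin;
(iii) re-transforming these isolated clones with a multitude of polynucleotides;
(iv) selecting clones of transformed cells for their ability to hydroxylate compactin;
(v) isolating the plasmid;
(vi) sequencing the insert of the plasmid.
The multitude of polynucleotides of step (iii) can be obtained from several sources. It may be genomic DNA, copy DNA or RNA, semi-synthetic, or of synthetic origin. It may originate from a eukaryotic or prokaryotic host. It may be provided as circular or linear polynucleotides. It may consist of specific polynucleotides (i.e. a gene, a gene family or an error-prone library derived from a gene), or of random polynucleotides (i.e. a metagenomic library or randomly digested genomic DNA). It may be expressed from its own promoter, or it may be cloned behind a promoter that is functional in the host of step (i).
Such nucleic acid sequences, encoding polypeptides that promote compactin hydroxylase activity, can also be isolated by, for example, screening a genomic or cDNA library of the microorganism that is the donor of the polynucleotide of the first aspect. Once a nucleic acid sequence homologous to a probe derived from SEQ ID NO 2 has been detected, that sequence or its surrounding DNA can be isolated or cloned using techniques known to those of ordinary skill in the art (see Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2nd edition, Cold Spring Harbor, New York).
In this way it is possible to clone variant polynucleotides encoding polypeptides with enhanced function, polynucleotides encoding polypeptides that accelerate or promote compactin hydroxylase function, or polynucleotides that activate the promoter in front of the compactin hydroxylase gene.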
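In one embodiment, a method is disclosed for improving the efficiency of the conversion of compactin to pravastatin by isolating a redox regeneration system, which p450 enzymes in fact require (Pylypenko and Schlichting, 2004, Annu. Rev. Biochem. 73:991-1018), and introducing it into the host cell expressing the compactin hydroxylase. The general methods for introducing such a system into a host cell are the same as those described for introducing the compactin hydroxylase and are given above. Such redox regeneration systems can be obtained from the species cited as species from which the polynucleotide of the second aspect can be obtained or in which it can be heterologously expressed; examples are, but are not limited to, Streptomyces species (i.e. Streptomyces carbophilus, Streptomyces flavidovirens, Streptomyces coelicolor, Streptomyces lividans, Streptomyces exfoliatus, Streptomyces avermitilis, Streptomyces clavuligerus), Amycolatopsis species (i.e. Amycolatopsis orientalis), Bacillus species (i.e. Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis), Corynebacterium species (i.e. Corynebacterium glutamicum) or Escherichia species (i.e. Escherichia coli). Alternative systems can also be applied. Examples of alternative systems are, but are not limited to, integrating the compactin hydroxylase of the invention in a class IV p450 system, thereby fusing it to its redox partners (Roberts et al., 2002, J. Bacteriol. 184:3898-3908 and Kubota et al., 2005, Biosci. Biotechnol. Biochem. 69:2421-2430), regeneration by NAD(P)H-generating enzymes not linked to p450, such as phosphite dehydrogenase (Johannes et al., 2005, Appl Environ Microbiol. 71:5728-5734), or regeneration by non-enzymatic means (Hollmann et al., 2006, Trends Biotechnol. 24:163-171).
In a fourth aspect of the invention, the pravastatin produced according to the method of the third aspect is comprised in a pharmaceutical composition.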
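Figure legends
Figure 1 shows the conversion catalyzed by the product of the cmpH gene, compactin hydroxylase. Legend: [C] = compactin; [P] = pravastatin.
Figure 2 shows plasmid pZERO-Ao-11H9. Legend: ORF-1 = first open reading frame, ORF-2 = second open reading frame, zeo = gene encoding zeocin resistance, kan = gene encoding kanamycin resistance.
Figure 3 shows plasmid pZERO-Ao-11H9d. Legend: ORF-1 = first open reading frame, zeo = gene encoding zeocin resistance, kan = gene encoding kanamycin resistance.
Figure 4 shows plasmid pACYC-taqScp450. Legend: Sc-p450 = Streptomyces carbophilus gene encoding the compactin hydroxylase p450, cat = gene encoding chloramphenicol resistance.
Figure 5 shows plasmid pACYC-taqAop450. Legend: Ao-cmpH = Amycolatopsis orientalis gene encoding the compactin hydroxylase p450, cat = gene encoding chloramphenicol resistance.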
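Examples
General methods
Standard DNA procedures and cultivation of prokaryotes were carried out as described elsewhere (Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). DNA was amplified using the proofreading enzyme Phusion polymerase (Finnzymes). Restriction enzymes were from Invitrogen or New England Biolabs. Hydrolysis of compactin was achieved by dissolving compactin in ethanol to a final concentration of 20 mg/ml and adding NaOH from a 4 M stock solution to a final concentration of 0.1 M. The solution was heated at 50°C for 1 to 2 hours and then cooled to room temperature. This solution can be stored at room temperature for 3 months. Stock solutions of pravastatin and of non-hydrolyzed compactin were prepared by dissolving each in ethanol at 20 mg/ml.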
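The NaOH addition in the hydrolysis step is an ordinary stock-dilution calculation. The sketch below works out how much 4 M stock brings a given volume of compactin solution to 0.1 M, taking into account that the added stock increases the total volume; the function name and the 10 ml batch size are illustrative assumptions, not values from the protocol.

```python
def stock_volume_needed(initial_volume_ml: float,
                        stock_molar: float,
                        final_molar: float) -> float:
    """Volume of stock (ml) to add so the mixture reaches `final_molar`.

    Solves stock_molar * v = final_molar * (initial_volume_ml + v) for v,
    i.e. the dilution caused by the added stock itself is accounted for.
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must lie between 0 and the stock")
    if initial_volume_ml <= 0:
        raise ValueError("initial volume must be positive")
    return initial_volume_ml * final_molar / (stock_molar - final_molar)


if __name__ == "__main__":
    v = stock_volume_needed(10.0, stock_molar=4.0, final_molar=0.1)
    print(f"add {v * 1000:.0f} ul of 4 M NaOH to 10 ml of compactin solution")
```

For small additions like this one the correction for the added volume is minor, but keeping it in the formula avoids systematic under-dosing when the stock is less concentrated.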
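Example 1
Screening for efficient whole-cell bioconversion of compactin to pravastatin
Different prokaryotic and fungal species (Table 1) were tested in order to isolate species with improved conversion of hydrolyzed compactin. All species were pre-cultured in 25 ml 2xYT medium for 1-3 days (depending on the growth rate of the species), washed and suspended in 25 ml fresh 2xYT medium. After an adaptation period of several hours of shaking at 280 rpm and 30°C, hydrolyzed compactin was added at final concentrations of 0.1, 0.2, 0.5 and 1 mg/ml. After 24 hours of incubation, the broth was harvested by transferring the contents of the shake flask into a 50 ml Greiner tube. Samples were frozen at -20°C and then lyophilized. Statins were extracted as follows: 1-2 ml methanol was added to the lyophilized sample, followed by repeated vortexing. Solids were separated from the liquid phase by centrifugation. 200 μl of the methanol extract was transferred into an HPLC vial and HPLC analysis was performed as follows:
Eluent A: 33% acetonitrile, 0.025% trifluoroacetic acid in milliQ water
Eluent B: 80% acetonitrile in milliQ water
Gradient:
Time (min)   Eluent A (%)   Eluent B (%)
0-8          100            0
8-8.1        100→0          0→100
8.1-12       0              100
12-13        0→100          100→0
13-14        100            0
Column: Waters XTerra RP18 (column temperature: room temperature)
Flow rate: 1 ml/min
Injection volume: 10 μl (tray temperature: room temperature)
Equipment: Waters Alliance 2695
Detector: Waters 996 photodiode array
Wavelength: 238 nm
Retention times: pravastatin 4 min, hydrolyzed compactin 10.4 min, compactin 10.9 min
Table 1: Prokaryotic species tested for hydroxylation of hydrolyzed compactin
As can be seen in Table 1, pravastatin is synthesized by four species of the test set: Actinokineospora riparia, Pseudonocardia alni, Streptomyces carbophilus and Amycolatopsis orientalis.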
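The HPLC gradient of Example 1 can be expressed as a piecewise-linear profile, which makes it easy to check the eluent composition at any time point, for instance at the stated retention times. This is a sketch under assumptions: the breakpoints follow the gradient rows of Example 1, with the garbled row read as 8.1 to 12 min at 0% A and 100% B, and the function names are illustrative.

```python
# (time in min, % eluent B) breakpoints; linear interpolation between them.
GRADIENT_BREAKPOINTS = [
    (0.0, 0.0), (8.0, 0.0), (8.1, 100.0), (12.0, 100.0), (13.0, 0.0), (14.0, 0.0),
]
ACN_IN_A = 33.0  # % acetonitrile in eluent A
ACN_IN_B = 80.0  # % acetonitrile in eluent B


def percent_b(time_min: float) -> float:
    """Programmed share of eluent B (%) at a given run time."""
    if not 0.0 <= time_min <= GRADIENT_BREAKPOINTS[-1][0]:
        raise ValueError("time outside the 0-14 min programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT_BREAKPOINTS, GRADIENT_BREAKPOINTS[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


def percent_acetonitrile(time_min: float) -> float:
    """Programmed acetonitrile content (%) of the mixed eluent."""
    b = percent_b(time_min) / 100.0
    return ACN_IN_A * (1.0 - b) + ACN_IN_B * b


if __name__ == "__main__":
    for name, rt in [("pravastatin", 4.0), ("hydrolyzed compactin", 10.4),
                     ("compactin", 10.9)]:
        print(f"{name}: {percent_acetonitrile(rt):.1f}% acetonitrile at {rt} min")
```

The values are the programmed composition at the pump; the composition actually reaching the column lags behind by the system dwell volume, which is not modelled here.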
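Example 2
Biological hydrolysis of compactin
To determine whether the species described in Example 1 can also hydrolyze and/or hydroxylate the lactone form of compactin, the four selected species were pre-cultured in 25 ml 2xYT medium for 1-3 days (depending on the growth rate of the species), washed and resuspended in 25 ml fresh 2xYT medium. After an adaptation period of several hours of shaking at 280 rpm and 30°C, non-hydrolyzed compactin was added at 0.2 mg/ml. After 24 hours of incubation, the broth was harvested by transferring the contents of the shake flask into a 50 ml Greiner tube. Samples were frozen at -20°C and then lyophilized. Statins were extracted as follows: 1-2 ml methanol was added to the lyophilized sample, followed by repeated vortexing. Solids were separated from the liquid phase by centrifugation. 200 μl of the methanol extract was transferred into an HPLC vial and analyzed by HPLC as described in Example 1. All four species (Actinokineospora riparia, Escherichia coli, Streptomyces carbophilus and Amycolatopsis orientalis) hydrolyze compactin, but this is not a prerequisite for pravastatin formation. Amycolatopsis orientalis is the most efficient species in synthesizing pravastatin.
Table 2: Prokaryotic species tested for hydrolysis and/or hydroxylation of compactin
Example 3
Amycolatopsis orientalis hydroxylates compactin very efficiently
Example 1 indicated that Amycolatopsis orientalis is superior to Streptomyces carbophilus for compactin hydroxylation. To investigate this further, both species were pre-cultured in 25 ml 2xYT medium for 24 hours, washed and resuspended in 25 ml fresh 2xYT medium. After several hours of shaking at 280 rpm and 30°C, hydrolyzed compactin was added at 0.1 and 0.2 mg/ml. After 24 hours of incubation, the broth was harvested by transferring the contents of the shake flask into a 50 ml Greiner tube. Samples were frozen at -20°C and then lyophilized. Statins were extracted as follows: 1-2 ml methanol was added to the lyophilized sample, followed by repeated vortexing. Solids were separated from the liquid phase by centrifugation. 200 μl of the methanol extract was transferred into an HPLC vial and analyzed by HPLC as described in Example 1. As can be seen from Table 3, Amycolatopsis orientalis is able to convert compactin to pravastatin with 100% efficiency, whereas Streptomyces carbophilus is not.
Table 3: Comparison of compactin hydroxylation by Amycolatopsis orientalis and Streptomyces carbophilus.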
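Example 4
Isolation of the gene fragment encoding the biocatalyst that converts compactin to pravastatin
Gene library of Amycolatopsis orientalis
A colony of Amycolatopsis orientalis was cultured at 28°C in liquid medium (10 g/l glucose, 5 g/l yeast extract, 20 g/l starch, 1 g/l CaCO3 and 0.5 g/l casamino acids; baffled flask) until OD = 2.0. Part was used to prepare glycerol stocks and part was used to inoculate (at a ratio of 1/50) a flask containing 50 ml liquid medium, to prepare cells for genomic DNA isolation. After 16 hours at 28°C, the culture was used to isolate genomic DNA. In addition, ampicillin was added to a final concentration of 200 μg/ml during the last hour of incubation. Cells were harvested by centrifugation (15 minutes at 8000 rpm) and the pellet was resuspended in 5 ml of 50 mM Tris-HCl (pH 8.0) with 50 mM EDTA. After addition of 100 μl lysozyme (100 mg/ml) and 40 μl proteinase K (20 mg/ml), the suspension was incubated at 37°C for 30 minutes. Nuclei Lysis Solution from Promega (6 ml) was added. Incubation at 80°C for 15 minutes and at 65°C for 30 minutes resulted in almost complete cell lysis. After RNase treatment (10 μl of a 100 mg/ml RNase solution), 2 ml of Protein Precipitation Solution from Promega was added and the mixture was vortexed (20 seconds) and incubated on ice (15 minutes). After centrifugation (15 minutes at 5000 rpm), the supernatant was mixed with 0.1 volume of NaAc (3 M, pH 5) and 2 volumes of EtOH (96%). The visible clump of precipitated genomic DNA was transferred with a Pasteur pipette and dissolved in 500 μl of 10 mM Tris (pH 8.0). A second proteinase K treatment was performed (10 μl of 20 mg/ml stock solution per 200 μl of sample, followed by incubation at 37°C for 30 minutes) to remove remaining protein. After the proteinase K step, 500 μl phenol/chloroform/isoamyl alcohol (PCI, 25:24:1) was added and the mixture was centrifuged at 14,000 rpm for 5 minutes. The upper phase was transferred to a new tube and 500 μl chloroform/isoamyl alcohol (24:1) was added to remove traces of phenol. The phases were separated by centrifugation, and the upper phase was mixed with 0.1 volume of NaAc (3 M, pH 5) and 2 volumes of EtOH (96%) to precipitate the DNA. The genomic DNA was removed with a pipette, rinsed with cold 70% EtOH and dissolved in 500 μl Tris-EDTA buffer. This yielded 134 μg of purified genomic DNA with an A260nm/A280nm ratio of 1.85. The isolated Amycolatopsis orientalis DNA was partially digested with Sau3AI (0.067 units/μg DNA) to obtain smaller fragments in the range of 4 to 10 kb. Genomic DNA (50 μg) was digested, and fragments between 4 and 10 kb were isolated from a preparative 0.6% agarose gel using the Qiagen QIAquick extraction kit and finally dissolved in 20 μl of 10 mM Tris, pH 8.0. These fragments were ligated to BamHI-digested pZErO-2 (Invitrogen) and transformed into Escherichia coli DH10B, giving approximately 39,000 colonies. Twenty individual colonies were used to inoculate 10 ml 2xYT medium in order to check the diversity of the library and determine the average insert size. Nineteen plasmids contained inserts of different sizes, while one colony had no insert (5% self-ligated vector). The average insert size of the gDNA fragments in pZErO-2 was 3.8 kb. All transformants obtained were collected from the plates and resuspended in liquid 2xYT medium containing kanamycin; glycerol was added to a final concentration of 8% (v/v) and the suspension was stored at -80°C.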
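The library figures reported above (about 39,000 colonies, 19 of 20 checked clones carrying an insert, 3.8 kb average insert) can be combined into a rough estimate of how much genomic DNA the library holds. The sketch below also applies the Clarke-Carbon formula for the chance that a given locus is represented; the genome size used there is a purely hypothetical placeholder, since the text does not state one, and all function names are illustrative.

```python
import math


def library_stats(colonies: int, clones_checked: int,
                  clones_with_insert: int, mean_insert_kb: float) -> dict:
    """Summarize a plasmid library from colony count and a small clone check."""
    recombinants = colonies * clones_with_insert / clones_checked
    return {
        "insert_fraction": clones_with_insert / clones_checked,
        "recombinant_clones": recombinants,
        "cloned_dna_mb": recombinants * mean_insert_kb / 1000.0,
    }


def probability_locus_present(recombinants: float, mean_insert_kb: float,
                              genome_kb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - insert/genome) ** N."""
    fraction = mean_insert_kb / genome_kb
    return 1.0 - math.exp(recombinants * math.log1p(-fraction))


if __name__ == "__main__":
    stats = library_stats(39_000, 20, 19, 3.8)
    print(stats)
    # HYPOTHETICAL genome size of 9,000 kb, used only to illustrate the formula.
    p = probability_locus_present(stats["recombinant_clones"], 3.8, 9_000.0)
    print(f"P(locus represented) = {p:.6f}")
```

The estimate treats all inserts as the mean size and ignores cloning bias, so it is an upper-level sanity check rather than a guarantee that any particular gene is present intact.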
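Screening of the Amycolatopsis orientalis gene library
The Amycolatopsis orientalis gene library was plated on 2xYT agar + kanamycin (50 mg/L) and incubated at room temperature for 72 hours. Almost 12,000 colonies were used to inoculate 120 96-well microtitre plates (MTPs) containing 0.2 ml 2xYT medium + 35 mg/L kanamycin per well. The MTPs were incubated at 25°C and 500 rpm for 48 hours. From each well, 140 μl of cell suspension was centrifuged at 3,000 rpm for 10 minutes and the supernatant was discarded by tapping the plate on tissue paper (the remainder of the culture was added to 50 μl of 20% glycerol and stored at -80°C). To each well, 250 μl of substrate solution was added (2xYT medium containing hydrolyzed compactin, 200 mg/L; glucose, 2 g/l; phosphate buffer, 50 mM; pH 6.8). The cell pellets in the wells were resuspended and incubated at 30°C and 280 rpm for 48 hours. Statins were extracted by adding 0.35 ml methanol per well and mixing at 280 rpm for 1 hour. Cell debris was removed by centrifugation at 2750 rpm for 15 minutes. 100 μl samples were analyzed by LC-MS.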
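LC-MS analysis of the Amycolatopsis orientalis gene library
The hydrolyzed compactin standard was prepared by hydrolyzing mevastatin (A.G. Scientific, catalogue no. A7413, purity 99.36%) with 1.5 M NaOH in MeOH (1:2), stirring overnight under an N2 atmosphere. The pH was lowered by adding HCl (4 M) and the standard was diluted further with water. Samples were analyzed on a Waters LC/MS system with a short (20 mm) CN column from ACT (Advanced Chromatography Technologies), using water and acetonitrile (both containing 0.1% formic acid) as the mobile phase. Details of the LC part:
Instrument: Waters Alliance 2795 LC
Mobile phase: solvent A, water with 0.1% formic acid; solvent B, acetonitrile with 0.1% formic acid
Needle wash: 50% Milli-Q water + 50% acetonitrile
Gradient timetable:
Time (min)   A%     B%     Flow (ml/min)   Curve
0.00         80.0   20.0   1.00            1
0.35         80.0   20.0   1.00            6
1.00         20.0   80.0   1.00            6
1.40         20.0   80.0   1.00            6
1.50         80.0   20.0   1.00            6
2.00         end
Column: ACT, ACE 3 CN, 20 x 2.1 mm, particle size 3 μm
Column temperature: 25°C
Injection volume: 5 μl
In MS electrospray ionization, positive mode (ES+) was used and the compounds were analyzed as their sodium adducts ([M+Na]+), with selected ion recording (SIR) for the three compounds. Details of the MS part:
Instrument: Waters ZQ 2000
Source: ES+
Capillary: 3.50 kV
Cone: 30 V
Desolvation temperature: 360°C
Source temperature: 140°C
Extractor: 2 V
RF lens: 0.3 V
Cone gas flow: 130 l/hour
Desolvation gas flow: 610 l/hour
LM1 resolution: 15.0
HM2 resolution: 15.0
Ion energy 1: 0.1
Multiplier: 650 V
Scan mass range (m/z): 200-600 amu, scan duration 0.20 s, inter-scan delay 0.05 s
SIR of 3 channels: 413.30, 431.3, 447.3; dwell 0.07 s; inter-scan delay 0.05 s
Compound                            MW       [M+Na]+
Compactin (lactone)                 390.24   413.3
Compactin (acid, hydrolyzed form)   408.25   431.3
Pravastatin (β-variant)             424.25   447.3
With this set-up it is possible to distinguish the four most important molecules: compactin, hydrolyzed compactin and the two stereoisomers of 6-hydroxy-compactin, i.e. the β-variant pravastatin (structure above) and the α-variant pravastatin.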
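The SIR channels follow directly from the molecular weights: a sodium adduct [M+Na]+ lies at the mass of the molecule plus one sodium atom minus one electron. The sketch below recomputes the three channels and also checks that the listed masses differ by one water (lactone opening) and one oxygen (hydroxylation). The atomic masses are standard monoisotopic values; the function and constant names are illustrative.

```python
SODIUM = 22.98977    # monoisotopic mass of Na, in u
ELECTRON = 0.00055   # electron mass, in u
WATER = 18.01056     # monoisotopic mass of H2O, in u
OXYGEN = 15.99491    # monoisotopic mass of O, in u

MOLECULAR_WEIGHTS = {
    "compactin (lactone)": 390.24,
    "compactin (acid, hydrolyzed form)": 408.25,
    "pravastatin": 424.25,
}


def sodium_adduct_mz(molecular_weight: float) -> float:
    """m/z of the singly charged sodium adduct [M+Na]+."""
    return molecular_weight + SODIUM - ELECTRON


if __name__ == "__main__":
    for name, mw in MOLECULAR_WEIGHTS.items():
        print(f"{name}: [M+Na]+ = {sodium_adduct_mz(mw):.2f}")
    lactone = MOLECULAR_WEIGHTS["compactin (lactone)"]
    print(f"lactone + H2O     = {lactone + WATER:.2f}")
    print(f"lactone + H2O + O = {lactone + WATER + OXYGEN:.2f}")
```

The computed adduct masses agree with the listed SIR channels to within 0.1 u, which is well inside the resolution of a single-quadrupole instrument.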
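Table 4: Example of the results of the compactin hydroxylase MTP screening
Several clones were identified as candidates because they showed a small but significant peak at the position of pravastatin. A subset of these results is shown in Table 4 as an example, in which one clone gave a signal above background (the clone at well position B1).
Example 5
Identification of the gene encoding the biocatalyst for the conversion of compactin to pravastatin
Re-testing of the putative compactin hydroxylases from Amycolatopsis orientalis
The analysis of Example 4 was repeated with the four clones identified as putative hits in the first round. This time, however, the clones were grown in shake flasks rather than MTPs, and some of the culture conditions were changed. 10 ml 2xYT was inoculated with Escherichia coli cells containing a putative compactin hydroxylase and the pre-culture was grown at 30°C and 280 rpm for 24 hours. Subsequently, 0.1-0.5 mM IPTG and 0.5 mM δ-aminolevulinate were added and the culture was incubated at 22°C and 280 rpm for 12 hours. Cells were harvested, washed and resuspended by vortexing in fresh 2xYT medium (supplemented with hydrolyzed compactin, 200 mg/L; glucose, 2 g/l; phosphate buffer, 50 mM; pH 6.8). The cell suspension was incubated at 30 or 37°C and 280 rpm for 24 or 48 hours. Statins were extracted and analyzed as described in Example 4.
Table 5: Results of re-testing the putative compactin hydroxylases from A. orientalis.
As can be seen from Table 5, only clone 11H9 gave a genuine, significant conversion of compactin to pravastatin. This clone was therefore selected for further analysis.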
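Sequencing and sequence analysis
Escherichia coli clone 11H9 was grown in 2xYT containing kanamycin, plasmid DNA was isolated using the Qiagen QIAprep kit, and the sequence of the Amycolatopsis orientalis genomic insert in the pZErO-2 plasmid was determined. The insert is 2545 nucleotides long (see SEQ ID NO 1). DNA sequence analysis identified two open reading frames (ORFs) (Figure 2). The first ORF encodes a putative protein of 401 amino acids (SEQ ID NO 2 and 3) that has some homology to known p450 enzymes (the best match being the cytochrome p450 monooxygenase CYP105S2 from Streptomyces tubercidicus). The second ORF has some transmembrane regions and may encode an ATP-binding cassette (ABC) type protein.
Identification of the structural gene cmpH
To identify the structural gene capable of hydroxylating compactin, ORF-2 was deleted from pZERO-Ao-11H9 by double digestion with SalI and XhoI. The 4.9 kb fragment was then isolated and self-ligated. The resulting plasmid, pZERO-Ao-11H9d, contains only ORF-1 as an intact ORF (Figure 3). This clone had the same conversion rate as clone pZERO-Ao-11H9, showing that ORF-1 encodes a functional compactin hydroxylase, designated cmpH.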
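Comparative Example 6
Activity of the Streptomyces carbophilus compactin hydroxylase in Escherichia coli
Construction of the p450-SCA E. coli expression clone
The gene encoding the Streptomyces carbophilus p450 was PCR-amplified from genomic DNA isolated from strain FERM-BP1145 using the primers of SEQ ID NO 7 and SEQ ID NO 8. The PCR fragment was cloned in the pCR2.1 TOPO/TA vector according to the supplier's instructions (Invitrogen). The expression clone was constructed as follows: pACYC-taq (M., 2000. Untersuchungen zum Einfluss Bereitstellung von Erythrose-4-Phosphat und Phosphoenolpyruvat auf den Kohlenstofffluss in den Aromatenbiosyntheseweg von Escherichia coli. Berichte des Forschungszentrums Jülich 3824, ISSN 0944-2952, PhD Thesis, University of Düsseldorf) was digested with Acc65I and ligated to the Acc65I fragment isolated from the pCR2.1 TOPO/TA vector, giving pACYC-taqScp450 (Figure 4).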
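Activity assay in Escherichia coli extracts
Escherichia coli cells containing pACYC-taqScp450 were grown in 10 ml 2xYT containing chloramphenicol. The cell suspension was processed and incubated with compactin essentially as described in Example 5. In this case, the cultivation temperature was 37°C, and chloramphenicol and IPTG (0.1 mM) were added to the reaction mixture. The reaction was incubated at 30°C and 220 rpm. Samples were taken at different time points and analyzed using the HPLC protocol described in Example 1. No pravastatin was detected after 24 hours.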
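Example 7
Activity of the Amycolatopsis orientalis compactin hydroxylase in Escherichia coli
Construction of the Ao-cmpH Escherichia coli expression clone
The gene encoding the Amycolatopsis orientalis p450 was PCR-amplified from isolated genomic DNA using the primers of SEQ ID NO 9 and SEQ ID NO 10. The PCR fragment was cloned in the pCR2.1 TOPO/TA vector according to the supplier's instructions (Invitrogen). The expression clone was constructed as follows: pACYC-taq (2000) was digested with Acc65I and ligated to the Acc65I fragment isolated from the pCR2.1 TOPO/TA vector, giving pACYC-taqAop450 (Figure 5).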
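Activity assay in Escherichia coli extracts
Escherichia coli cells containing pACYC-taqAop450 (Figure 5) were grown in 2xYT (10 ml) containing chloramphenicol. The cell suspension was incubated with compactin as described in Example 5. The cultivation temperature was 37°C, and chloramphenicol and IPTG (0.1 mM) were added. The reaction was incubated at 30°C and 220 rpm. Samples were taken at different time points and analyzed using the HPLC protocol described in Example 1. After 24 hours a large pravastatin peak was detected. This result clearly demonstrates that the Amycolatopsis orientalis p450 enzyme is much better suited than other p450 enzymes to hydroxylating compactin in Escherichia coli.
Table 6: Compactin conversion by Escherichia coli strains carrying P450 genes. The data summarize the average conversion ratios of 50 tested clones, each of which gave at least 90% conversion of compactin to pravastatin. As shown below, all DNA fragments of SEQ ID NO 11-18 and 27-34 gave very similar compactin conversion profiles. Percentages refer to the compactin converted.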
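Example 8
Activity of derivatives of the Amycolatopsis orientalis compactin hydroxylase in Escherichia coli
The genes of SEQ ID NO 11-18 and 27-34 were produced synthetically and used as templates in PCR reactions with the oligonucleotides of SEQ ID NO 9 and 10, to which the recombination sites attB1 (for SEQ ID NO 9) and attB2 (for SEQ ID NO 10) had been added. The PCR fragments were cloned into the pDONR221 vector (Invitrogen Corporation, the Netherlands) by a Gateway BP reaction (Invitrogen Corporation); the sequences were verified by DNA sequencing to exclude PCR-related errors. Using the Gateway LR reaction, the genes were transferred from the pDONR221 vector to the pET-DEST42 vector, giving the final expression vectors pET-DEST42-P450. Escherichia coli BL21 DE3 carrying pET-DEST42-P450 (with SEQ ID NO 11-18 or 27-34 as insert) was grown at 30°C in 10 ml 2xYT containing kanamycin until OD600 = 0.5-1.0. The cultures were then supplemented with 0.1-0.3 mM IPTG and 0.5 mM δ-aminolevulinate and incubated at 22°C and 280 rpm for 12 hours. Cells were harvested, washed and resuspended in fresh 2xYT medium (supplemented with hydrolyzed compactin, 200 mg/L; glucose, 2 g/l; phosphate buffer, 50 mM; pH 6.8). The cell suspension was incubated at 30°C or 37°C and 280 rpm for 24 or 48 hours. Statins were extracted and analyzed as described in Example 4. After 24 hours a very large pravastatin peak could be identified. In contrast to the experiment described in Example 7, most of the pravastatin produced was the β-variant, providing evidence that the stereospecificity of the enzymes of SEQ ID NO 19-26 and SEQ ID NO 35-42 is significantly altered compared with the compactin hydroxylases of SEQ ID NO 3 and SEQ ID NO 6, respectively.
Table 7: Compactin conversion by Escherichia coli strains carrying P450 genes. The data summarize the average conversion ratios of 50 tested clones, each of which gave at least 90% conversion of compactin to pravastatin. As shown below, all DNA fragments of SEQ ID NO 11-18 and 27-34 gave very similar compactin conversion profiles. Percentages refer to the compactin converted.
Sequence listing
<110>DSM IP Assets B.V.
<120>Process for preparing pravastatin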
<130>25590WO
<150>EP06126046.9
<151>2006-12-13
<160>59
<170>PatentIn version 3.2
<210>1
<211>2545
<212>DNA
<213>Amycolatopsis orientalis
<400>1
gatctctacc tcgcgctggc gaacgacacg gactgactag ccccgcggcg ggttgaagat 60
catcgattcc gggttgatct gcttggcctt ctccatcagg ccgtactgcg tcatcaggtc 120
gatcacccgc tgcatgcgca ccgggctcat cgcggtcggc cacgtgccga ggcgcatcag 180
gctgacggtg tccttgtcca ctttggcgta gctggtcacg gtctgctcga ccaggctgcg 240
gttggccgcg tcccgctgtc ccttgacgat cgcccgctgg aacgccgccg tggtcttcgg 300
gttctcctgt gcgtacttgg cgctggtcgc ccacaccgcg atcggcacgt ccaaagtggc 360
cccggtcgcc gcgtccagca ccggcagcat cccggccttg cgctgcgcct gggtgatgta 420
cggctccacc atgaacgcgg cgtcgacgtt cttgcgctcg atggccgcct gcatgtccgg 480
gaacgggatc tcggtgaagg tcaccgtctt gatgtccaca ccgttggcct cgagcgcgga 540
ccgtgcggtc agttcgacga tgttcgcctt ggtgttgatc gcgatcttct tgccggccag 600
gtcggctggc ttggtgatcg cgttgtcctt gccggtcagg atcaggaaca tgccctgagc 660
ggcctggtag gcgtccgcga ccagcttgat gtccagcacg ttcttgtact gcgcggtgaa 720
gaacgagacg tagttgccga atgcgaactg cagttcgccg ttggcaaggc cgggcaccgc 780
cgccgcgccg cccggcagcg acttcagttc gacatcaagg ccttcctggg tgaagtagcc 840
tttctgctgt gcgatggcca gcggtacggt gtccacaatg ggcaacgtgc cgacgaccac 900
tttggtctgc tccaaaccgc ctgtctggtt gggcttttcg gtatccccac ccagtgccga 960
acagctggcg gcggcgaggg cgaggacgca ggacagggct atgcgccacg gacgggcgag 1020
tgacatgcgg ggttctcctg gcaggcaaga cacgatgatc tgggccggat cagaccacat 1080
cgtccctgtt caacgccagt cgacggaccc tactttcgac tgaaatatgc cagagctcac 1140
tcgttaagtg gcacgaatgt gctatgcatc ccatgcaacg ggcagcgccc aaaccccgta 1200
cgccgcggac ttctcgcgca gcttgatctc ctcggccggc accgcgagcc gcagcgacgg 1260
gaacctcgcg aacagccggg tgaagccgat gcgcatctcc actctggcca gttgctggcc 1320
gaggcattgg tgtataccac cgccgaacgc ggcgtgcttg cgtgcgtcca ctctgtccag 1380
ttgaaggata tcgggttcgt cgaacacctt cgggtcccta ttgaccgcgg gcagtccgat 1440
cgcgacagtg tcgcccttcc tgatcatctg gccttcgagc tccacgtcct ccagcgccgc 1500
ccggttgggc gttccaaggt ggacgatcga gaggtagcgc agcagttcct ccaccgcgtc 1560
cgggctgtcc agggcagcga tctgctccgg atgctgaagg agcgcgaaag tgcctaatcc 1620
caacatgttc gcggtggtct cgtgcccggc gacgagcaaa agcaacgcga tgttcgtcag 1680
ctcttcatcg gtcagatcgg tgtccgtgat caagctgcca agcaggtcgt ccttggggct 1740
caaccgcttc gtggcgacca gttcagcgat gtagcgggtg agtttgccaa gcgccgtcgt 1800
cacctcatcc tgtgtcttgt ccacactggc catgatcgtg gtctgctctt ggaagaacgc 1860
gtgatcggca tacgagacgc ccagcagctc gcagatcacc agcgaaggca ctggcaacgc 1920
gaacgcctgc accagatcga ccggcggtcc tgctttggcc atcgcgtcga ggtggtcctc 1980
ggtgatctgg acgatccgcg gttcgagttc cttgatccgt cgcacggtga actggctgat 2040
cagcatccgg cggtaacgcg tgtgctcggg tgcgtccatg ttgatgaacc agccgggcgc 2100
cggggctttc gtggctccgc ctggtcgtgg gatgacgctg aacaccgggt gcttgtgctc 2160
ggggcggttg ctgaagcgcg gatcgatcat gacagtccgc gctgcggcat ggctggtcac 2220
cagccagccg atgtggccgt cagggaatcg catcgggctc actggcggaa gtttcaccag 2280
gtcaggcggt gggtcgaagg ggtagcccac ggctcggccg gtcggtagtg tcactggctc 2340
gttcatattt tcggagtcta ctctcatttg atggtggact gtcaaagaag agagttctcc 2400
ggtgtgcagt tatcctgctg cggtggatca accagcgact gggttgcggg aacgcaagaa 2460
ggccagaacg aagaccgcca tccagcagca cgcgctgcgg ctgttcaagg agcacggcta 2520
ccaggccacc acggtcgagc agatc 2545
<210>2
<211>1206
<212>DNA
<213>Amycolatopsis orientalis
<220>
<221>CDS
<222>(1)..(1206)
<400>2
atg aga gta gac tcc gaa aat atg aac gag cca gtg aca cta ccg acc 48
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
ggc cga gcc gtg ggc tac ccc ttc gac cca ccg cct gac ctg gtg aaa 96
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
ctt ccg cca gtg agc ccg atg cga ttc cct gac ggc cac atc ggc tgg 144
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
ctg gtg acc agc cat gcc gca gcg cgg act gtc atg atc gat ccg cgc 192
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
ttc agc aac cgc ccc gag cac aag cac ccg gtg ttc agc gtc atc cca 240
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
cga cca ggc gga gcc acg aaa gcc ccg gcg ccc ggc tgg ttc atc aac 288
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
atg gac gca ccc gag cac acg cgt tac cgc cgg atg ctg atc agc cag 336
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
ttc acc gtg cga cgg atc aag gaa ctc gaa ccg cgg atc gtc cag atc 384
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
acc gag gac cac ctc gac gcg atg gcc aaa gca gga ccg ccg gtc gat 432
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
ctg gtg cag gcg ttc gcg ttg cca gtg cct tcg ctg gtg atc tgc gag 480
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
ctg ctg ggc gtc tcg tat gcc gat cac gcg ttc ttc caa gag cag acc 528
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
acg atc atg gcc agt gtg gac aag aca cag gat gag gtg acg acg gcg 576
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
ctt ggc aaa ctc acc cgc tac atc gct gaa ctg gtc gcc acg aag cgg 624
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
ttg agc ccc aag gac gac ctg ctt ggc agc ttg atc acg gac acc gat 672
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
ctg acc gat gaa gag ctg acg aac atc gcg ttg ctt ttg ctc gtc gcc 720
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
ggg cac gag acc acc gcg aac atg ttg gga tta ggc act ttc gcg ctc 768
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
ctt cag cat ccg gag cag atc gct gcc ctg gac agc ccg gac gcg gtg 816
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
gag gaa ctg ctg cgc tac ctc tcg atc gtc cac ctt gga acg ccc aac 864
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
cgg gcg gcg ctg gag gac gtg gag ctc gaa ggc cag atg atc agg aag 912
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
ggc gac act gtc gcg atc gga ctg ccc gcg gtc aat agg gac ccg aag 960
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
gtg ttc gac gaa ccc gat atc ctt caa ctg gac aga gtg gac gca cgc 1008
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
aag cac gcc gcg ttc ggc ggt ggt ata cac caa tgc ctc ggc cag caa 1056
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
ctg gcc aga gtg gag atg cgc atc ggc ttc acc cgg ctg ttc gcg agg 1104
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
ttc ccg tcg ctg cgg ctc gcg gtg ccg gcc gag gag atc aag ctg cgc 1152
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
gag aag tcc gcg gcg tac ggg gtt tgg gcg ctg ccc gtt gca tgg gat 1200
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
gca tag 1206
Ala
<210>3
<211>401
<212>PRT
<213>Amycolatopsis orientalis
<400>3
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>4
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>4
atgcgtgtcg actccgaaaa catgaacgag cctgtgaccc tccccaccgg ccgtgccgtg 60
ggctacccct tcgaccctcc tcctgacctg gtgaagcttc ctcccgtgag ccccatgcgc 120
ttccctgacg gccacatcgg ctggctggtg accagccacg ccgctgcgcg tactgtcatg 180
atcgatcccc gcttcagcaa ccgccccgag cacaagcacc ctgtgttcag cgtcatcccc 240
cgccccggcg gagccactaa ggcccccgcg cccggctggt tcatcaacat ggacgccccc 300
gagcacaccc gttaccgccg catgctgatc agccagttca ccgtgcgccg tatcaaggaa 360
ctcgaacctc gtatcgtcca gatcaccgag gaccacctcg acgcgatggc caaggctgga 420
cctcctgtcg atctggtgca ggcgttcgcg ttgcctgtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtacgc cgatcacgcg ttcttccagg agcagaccac catcatggcc 540
tccgtggaca agactcagga tgaggtgacc accgcgcttg gcaagctcac ccgctacatc 600
gctgaactgg tcgccactaa gcgtttgagc cccaaggacg acctgcttgg cagcttgatc 660
actgacaccg atctgaccga tgaagagctg accaacatcg cgttgctttt gctcgtcgcc 720
ggtcacgaga ccaccgcgaa catgttggga ctcggcactt tcgcgctcct tcagcacccc 780
gagcagatcg ctgccctgga cagccccgac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacccc caaccgtgcg gcgctggagg acgtggagct cgaaggccag 900
atgatccgca agggcgacac tgtcgcgatc ggactgcccg cggtcaaccg tgaccccaag 960
gtgttcgacg aacccgatat ccttcagctg gaccgtgtgg acgctcgcaa gcacgccgcg 1020
ttcggcggtg gtattcacca gtgcctcggc cagcagctgg cccgtgtgga gatgcgcatc 1080
ggcttcaccc gtctgttcgc gcgcttcccc tcgctgcgtc tcgcggtgcc cgccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggtgtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>5
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<220>
<221>CDS
<222>(1)..(2199)
<400>5
atg cgt gtc gac tcc gaa aac atg aac gag cct gtg acc ctc ccc acc 48
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
ggc cgt gcc gtg ggc tac ccc ttc gac cct cct cct gac ctg gtg aag 96
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
ctt cct ccc gtg agc ccc atg cgc ttc cct gac ggc cac atc ggc tgg 144
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
ctg gtg acc agc cac gcc gct gcg cgt act gtc atg atc gat ccc cgc 192
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
ttc agc aac cgc ccc gag cac aag cac cct gtg ttc agc gtc atc ccc 240
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
cgc ccc ggc gga gcc act aag gcc ccc gcg ccc ggc tgg ttc atc aac 288
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
atg gac gcc ccc gag cac acc cgt tac cgc cgc atg ctg atc agc cag 336
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
ttc acc gtg cgc cgt atc aag gaa ctc gaa cct cgt atc gtc cag atc 384
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
acc gag gac cac ctc gac gcg atg gcc aag gct gga cct cct gtc gat 432
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
ctg gtg cag gcg ttc gcg ttg cct gtg cct tcg ctg gtg atc tgc gag 480
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
ctg ctg ggc gtc tcg tac gcc gat cac gcg ttc ttc cag gag cag acc 528
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
acc atc atg gcc tcc gtg gac aag act cag gat gag gtg acc acc gcg 576
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
ctt ggc aag ctc acc cgc tac atc gct gaa ctg gtc gcc act aag cgt 624
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
ttg agc ccc aag gac gac ctg ctt ggc agc ttg atc act gac acc gat 672
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
ctg acc gat gaa gag ctg acc aac atc gcg ttg ctt ttg ctc gtc gcc 720
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
ggt cac gag acc acc gcg aac atg ttg gga ctc ggc act ttc gcg ctc 768
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
ctt cag cac ccc gag cag atc gct gcc ctg gac agc ccc gac gcg gtg 816
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
gag gaa ctg ctg cgc tac ctc tcg atc gtc cac ctt gga acc ccc aac 864
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
cgt gcg gcg ctg gag gac gtg gag ctc gaa ggc cag atg atc cgc aag 912
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
ggc gac act gtc gcg atc gga ctg ccc gcg gtc aac cgt gac ccc aag 960
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
gtg ttc gac gaa ccc gat atc ctt cag ctg gac cgt gtg gac gct cgc 1008
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
aag cac gcc gcg ttc ggc ggt ggt att cac cag tgc ctc ggc cag cag 1056
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
ctg gcc cgt gtg gag atg cgc atc ggc ttc acc cgt ctg ttc gcg cgc 1104
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
ttc ccc tcg ctg cgt ctc gcg gtg ccc gcc gag gag atc aag ctg cgc 1152
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
gag aag tcc gcg gcg tac ggt gtt tgg gcg ctg ccc gtt gct tgg gat 1200
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
gcc tct agt gtg ctg cac cgt cac cag cct gtc acc atc gga gaa ccc 1248
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
gcc gcc cgt gcg gtg tcc cgc acc gtc acc gtc gag cgc ctg gac cgt 1296
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
atc gcc gac gac gtg ctg cgc ctc gtc ctg cgc gac gcc ggc gga aag 1344
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
act ctc ccc act tgg act ccc ggc gcc cac atc gac ctc gac ctc ggc 1392
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
gcg ctg tcg cgc cag tac tcc ctg tgc ggc gcg ccc gat gcg cct agc 1440
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
tac gag att gcc gtg cac ctg gat ccc gag agc cgc ggc ggt tcg cgc 1488
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
tac atc cac gaa cag ctc gag gtg gga agc cct ctc cgt atg cgc ggc 1536
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
cct cgt aac cac ttc gcg ctc gac ccc ggc gcc gag cac tac gtg ttc 1584
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
gtc gcc ggc ggc atc ggc atc acc cct gtc ctg gcc atg gcc gac cac 1632
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
gcc cgc gcc cgt gga tgg agc tac gaa ctg cac tac tgc ggc cgt aac 1680
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
cgt tcc ggc atg gcc tac ctc gag cgt gtc gcc ggt cac ggt gac cgt 1728
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
gcc gcc ctg cac gtg tcc gag gaa ggc acc cgt atc gac ctc gcc gcc 1776
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
ctc ctc gcc gag ccc gcc ccc ggc gtc cag atc tac gcg tgc ggt gcc 1824
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
ggt cgt ctg ctc gcc gga ctc gag gac gcg agc cgt aac tgg ccc gac 1872
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
ggt gcg ctg cac gtc gag cac ttc acc tcg tcc ctc gcg gcg ctc gat 1920
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
cct gac gtc gag cac gcc ttc gac ctc gaa ctg cgt gac tcg ggt ctg 1968
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
acc gtg cgt gtc gaa ccc acc cag acc gtc ctc gac gcg ttg cgc gcc 2016
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
aac aac atc gac gtg ccc agc gac tgc gag gaa ggc ctc tgc ggc tcg 2064
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
tgc gag gtc gcc gtc ctc gac ggc gag gtc gac cac cgc gac act gtg 2112
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
ctg acc aag gcc gag cgt gcg gcg aac cgt cag atg atg acc tgc tgc 2160
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
tcg cgt gcc tgc ggc gac cgt ctg gcc ctg cgt ctc taa 2199
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>6
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic DNA
<400>6
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>7
<211>33
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>7
gggggtacca tggccgagat gacagagaaa gcc 33
<210>8
<211>30
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>8
gggggtacct caccaggtga ccgggagttc 30
<210>9
<211>40
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>9
gggggtacca tgagagtaga ctccgaaaat atgaacgagc 40
<210>10
<211>30
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>10
gggggtaccc tatgcatccc atgcaacggg 30
<210>11
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>11
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgctccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgtta 540
agtgtggaca agacacagga tgaggtgacg acagcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgctggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>12
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>12
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gaccgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tctttaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>13
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>13
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgtgctcct tcagcacccg 780
gagcagatcg ctcttctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>14
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>14
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
tttcctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcgccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag aacagaccac gatcatgttt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cctaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacaccg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggagcgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>15
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>15
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtccg gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctaatctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>16
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>16
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag agcagaccac gatcatgctg 540
agtgtggaca agacacagga taaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>17
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>17
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccgcctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgctt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acatggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcctcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>18
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>18
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctaccccc tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgttg 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg cttgcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>19
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>19
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Ser Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>20
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>20
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Phe Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>21
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>21
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Val Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Leu Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>22
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>22
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Phe Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Thr Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Ala Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>23
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>23
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Arg Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Asn Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>24
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>24
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Lys Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>25
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>25
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp Arg Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Met Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Leu Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>26
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>26
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Leu Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Cys Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>27
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>27
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgctccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgtta 540
agtgtggaca agacacagga tgaggtgacg acagcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgctggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>28
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>28
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gaccgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tctttaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>29
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>29
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgtgctcct tcagcacccg 780
gagcagatcg ctcttctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>30
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>30
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
tttcctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcgccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag aacagaccac gatcatgttt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cctaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacaccg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggagcgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>31
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>31
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtccg gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctaatctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>32
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>32
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag agcagaccac gatcatgctg 540
agtgtggaca agacacagga taaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>33
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>33
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccgcctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgctt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acatggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcctcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>34
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>34
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctaccccc tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgttg 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg cttgcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>35
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>35
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Ser Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>36
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>36
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Phe Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>37
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>37
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Val Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Leu Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>38
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>38
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Phe Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Thr Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Ala Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>39
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>39
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Arg Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Asn Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>40
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>40
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Lys Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>41
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>41
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp Arg Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Met Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Leu Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>42
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>42
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Leu Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Cys Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>43
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>43
Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>44
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>44
Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>45
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>45
Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn Arg
1 5 10
<210>46
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>46
Glu Glu Ile Lys Leu Arg Glu Lys Ser Ala Ala Tyr Gly
1 5 10
<210>47
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>47
Phe Gln Glu Gln Thr Thr Ile Met Ala Ser Val Asp
1 5 10
<210>48
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>48
Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>49
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>49
Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>50
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>50
Lys Ala Pro Ala Pro Gly Trp Phe Phe Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>51
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>51
Leu Thr Asn Thr Ala Leu Leu Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>52
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>52
Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>53
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>53
Leu Thr Asn Ile Ala Leu Pro Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>54
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>54
Arg Tyr Leu Ser Ile Val His Leu Gly Ala Pro Asn Arg
1 5 10
<210>55
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>55
Glu Glu Ile Lys Leu Arg Glu Lys Ser Thr Ala Tyr Gly
1 5 10
<210>56
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>56
Phe Gln Glu Gln Thr Thr Ile Met Thr Ser Val Asp
1 5 10
<210>57
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>57
Phe Gln Glu Gln Thr Thr Ile Met Val Ser Val Asp
1 5 10
<210>58
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>58
Phe Gln Glu Gln Thr Thr Ile Met Leu Ser Val Asp
1 5 10
<210>59
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>59
Phe Gln Glu Gln Thr Thr Ile Met Phe Ser Val Asp
1 5 10
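Claims
1. A polypeptide selected from the group consisting of: a polypeptide having an amino acid sequence according to SEQ ID NO 3, SEQ ID NO 6 or SEQ ID NO 43-59; a polypeptide having an amino acid sequence with a degree of identity of at least 50% to SEQ ID NO 3; a polypeptide having an amino acid sequence with a degree of identity of at least 60% to SEQ ID NO 6; and a polypeptide having an amino acid sequence that differs from SEQ ID NO 43-59 by no more than 3 amino acids.
2. The polypeptide according to claim 1, having an amino acid sequence according to SEQ ID NO 3, SEQ ID NO 6, SEQ ID NO 19-26 or SEQ ID NO 35-59, or having an amino acid sequence with a degree of identity of at least 90% to SEQ ID NO 3, SEQ ID NO 6, SEQ ID NO 19-26 or SEQ ID NO 35-59.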
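3. A polypeptide capable of converting compactin into pravastatin with an efficiency of at least 50%.
4. A polynucleotide comprising a DNA sequence encoding the polypeptide of any one of claims 1 to 3.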
5. The polynucleotide of claim 4, which is SEQ ID NO 1, 2, 4 or 5.
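6. A method for producing pravastatin, comprising the steps of:
(i) expressing the polynucleotide of any one of claims 4 to 5 in a production host;
(ii) culturing said production host obtained in step (i);
(iii) isolating pravastatin from the mixture obtained in step (ii).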
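7. A method for isolating a polynucleotide encoding a polypeptide capable of facilitating the conversion of compactin into pravastatin, said method comprising the steps of:
(i) transforming a host cell with the polynucleotide of any one of claims 4 to 5;
(ii) selecting clones of the transformed cells for their ability to hydroxylate compactin;
(iii) re-transforming these isolated clones with a multitude of polynucleotides;
(iv) selecting clones of the transformed cells for their ability to hydroxylate compactin;
(v) isolating the plasmids;
(vi) sequencing the inserts of said plasmids.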
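8. The method according to claim 6, further comprising co-expressing, in the production host obtained in step (i), the isolated polynucleotide according to claim 7.
9. The method according to claim 8, wherein compactin is added during growth of said production host.
10. The method according to any one of claims 8 to 9, wherein said production host is a fungal cell or a bacterial cell.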
11. The method according to claim 10, wherein said fungal cell is a yeast or a filamentous fungal cell, and said bacterial cell is selected from the group consisting of Actinomycetes and Proteobacteria.
12. The method according to claim 11, wherein said yeast is Saccharomyces cerevisiae, Hansenula polymorpha, Kluyveromyces lactis or Pichia pastoris; said filamentous fungal cell is Aspergillus terreus, Aspergillus nidulans, Aspergillus niger, Penicillium citrinum, Penicillium chrysogenum, Monascus ruber or Monascus paxii; said Actinomycete is Streptomyces, Amycolatopsis or Actinomadura; and said Proteobacterium is Escherichia or Bacillus.
13. The method according to claim 12, wherein said Streptomyces is Streptomyces carbophilus, Streptomyces lividans, Streptomyces coelicolor or Streptomyces clavuligerus; said Amycolatopsis is Amycolatopsis orientalis; said Escherichia is Escherichia coli; and said Bacillus is Bacillus amyloliquefaciens, Bacillus licheniformis or Bacillus subtilis.
14. A pharmaceutical composition comprising pravastatin obtained according to any one of claims 6 and 8 to 13.
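The "no more than 3 amino acids" criterion of claim 1 can be illustrated with a small sketch. It compares SEQ ID NO 43 and SEQ ID NO 48 from the sequence listing position by position. The function names (`parse_residues`, `count_differences`, `percent_identity`) are hypothetical, and the ungapped, equal-length comparison is a simplifying assumption: this section does not say how identity is determined, and an alignment-based method may be intended for the longer sequences.

```python
def parse_residues(text):
    """Split a three-letter-code sequence line into a list of residues."""
    return text.split()


def count_differences(a, b):
    """Count positions at which two equal-length residue lists differ.

    Assumption: no gaps or alignment; sequences are compared position by position.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a, b):
    """Ungapped percent identity over the full (shared) length."""
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)


# Residues copied from the sequence listing (SEQ ID NO 43 and SEQ ID NO 48).
SEQ_43 = parse_residues(
    "Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn Met Asp Ala Pro Glu His"
)
SEQ_48 = parse_residues(
    "Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn Met Asp Ala Pro Glu His"
)

differences = count_differences(SEQ_43, SEQ_48)
# Threshold taken from claim 1: at most 3 differing amino acids.
within_three = differences <= 3

print(differences, within_three, percent_identity(SEQ_43, SEQ_48))
```

In the listing the two peptides differ only at position 9 (Ile versus Ala), so under these assumptions SEQ ID NO 48 lies within three amino acids of SEQ ID NO 43.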
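Abstract
The present invention provides polypeptides having an amino acid sequence according to SEQ ID NO 3, SEQ ID NO 6 or SEQ ID NO 43-59. The invention also provides polynucleotides comprising a DNA sequence encoding these polypeptides, and a method for isolating polynucleotides encoding polypeptides capable of facilitating the conversion of compactin into pravastatin. Furthermore, the invention provides a method for producing pravastatin and a pharmaceutical composition comprising pravastatin.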
Document number: C12P7/62 GK101558152 SQ200780046270
Publication date: October 14, 2009. Filing date: December 11, 2007. Priority date: December 13, 2006.
Inventors: Paul Klaassen, Adrianus Wilhelmus Hermanus Vollebregt, Marco Alexander van den Berg, Marcus Hans, Jan Metske van der Laan. Applicant: DSM IP Assets B.V.